guestbook_tianyi.md

File metadata and controls

63 lines (36 loc) · 5.61 KB
Tianyi Zheng, tiz65@pitt.edu

Rohan's Notes

Something I liked - I thought it was very interesting how you're looking to study language learners' syntactic proficiency. That seems to be a more ambitious analysis goal, especially when I think about what we've done in class with ESL learners.

Something to improve - There was a good amount of description and analysis, but it was generally quite terse and technical. I wasn't entirely sure what some of the measurements you mentioned were, and you didn't go into detail about them. I would have liked a little more description and consideration of your audience's knowledge level in the analysis.

Something I learned - Your data set was really interesting! I thought it was fascinating how uneven the language counts were compared to the ETS data we've worked with as a class.

@Rohan Thanks for the feedback! I'll definitely try to be clearer with my technical language for the next progress report.

Misha's Feedback

One thing that was done well: I really like your visuals, and I'm glad to see that you have an idea of how to operationalize syntactic complexity.

One avenue for improvement or suggestion: You might want to include a little bit of info about how these measures of syntactic complexity can be identified and quantified in the data.

One thing I learned: PELIC has POS tagging, that rocks!

Kinan's Notes

Something I liked - The data sample was cool to see because it puts into perspective how you are going to move forward with this and what you're going to be analyzing from your project plan.

Something to improve - The readme could be a bit more organized. Maybe describe each file?

Something I learned - The histogram plot with seaborn was really cool to see.

@Kinan Thanks for the feedback! I'll make sure to add more to the README.

Man Ho's Notes

Something I liked - The topic is very interesting. I wonder how much a learner's English syntax reflects the learner's L1. The difference in syntax among learners of different L1s may be very subtle, particularly if you are comparing learners at the same English proficiency level. It is a good idea to look at many metrics at the same time.

Something to improve - I think you can include the definitions of different labels in your data. For example, what are the 'level_id's? How are they related to proficiency level A1 to C2? Besides, the dataset has an unequal distribution of L1s. This will likely affect the accuracy of your predictive analysis, if that's what you plan to do with the data. You may need to figure out how to overcome this class imbalance problem.

Something I learned - TAASSC is a very convenient tool to measure multiple linguistic metrics simultaneously!

Emma's Notes

  • What was done well: The fact that you had already chosen to use the TAASSC in your project plan is very impressive; it makes it very clear exactly how you'll go about adding on to the dataset! This demonstrates an immense amount of foresight and a thorough understanding of your project and the methods you'll be using to complete it. Your progress report is very well-written and clear! Visualizations in the data overview give a very good sense of the data's distribution.
  • What could be improved: It would be helpful to have explanations of preexisting data columns (even briefly, since they can probably be found on the original PELIC site). An explanation of the TAASSC and T-Units would also be helpful. Overall, I felt that your code, presentation, and analysis were extremely well-designed, but short explanations of the data and the tools used could be beneficial!
  • One thing I learned: It's interesting that the TAASSC didn't catch any discourse markers in the essays. I'd love to know exactly how it operationalizes discourse markers. If it's looking for spoken markers like "uh," "um," "like," etc., this result makes sense, but why not consider written markers ("first of all," "for example," etc.)? I learned a lot about this syntactic tool, but now I'm extra interested in its inner workings!

Caroline's entry (2022-04-07)

  • What was done well: This is a super cool project! Your initial exploration of the dataset as well as your exploration of the TAASSC output data were very thorough, and you offered good commentary along with the code that made it fairly easy for me to follow along with even though I am not familiar with the dataset. You also utilized visualizations in meaningful ways.

  • Possible improvements: For your presentation and also to have in your repo, it may be beneficial to have some brief explanations about the PELIC dataset and the various TAASSC measurements because it can all get very technical for a first-time visitor.

  • One thing I learned: all about TAASSC! I was very interested to see all the syntactic measurements because we haven't tended to focus on them as much in other assignments.

Alejandro's Feedback

Nice use of the PELIC dataset. I did not know what a T-Unit was until I read your README.md. Something I like is that, in general, your repository is extremely clean and your data is presented nicely in your Jupyter Notebooks. However, I think one improvement could be looking into that weird issue you had with Korean speakers having a large number of high-level essays, potentially by taking multiple samples.

Ben's Feedback

I really like the idea of looking at ESL syntax! Your repo is very well organized; the glossary in the README was a nice touch. In terms of potential improvements, some of the dataframe column names were unclear.