Processing files with Stanford CoreNLP - Python module, Stanza

The data used here was scraped from TripAdvisor airline reviews (see https://github.com/PeterCaine/Trip-Advisor_Scrape.git for details).

This uses Stanza, the Stanford NLP Python package, to annotate each token with its lemma, part-of-speech tags (xpos and upos), dependency relation and dependency head. The annotated reviews are then written out as a CoNLL-formatted .tsv file for use in a sequence labelling task.
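As a rough illustration of the conversion, here is a minimal sketch rather than the code in this repo. The helper names `annotate` and `to_conll_tsv` and the column order are assumptions made for the example. It relies on Stanza's `Document.to_dict()` output, which is a list of sentences, each a list of per-word dicts.

```python
# Assumed column order for the example; the repo's actual layout may differ.
COLUMNS = ["id", "text", "lemma", "upos", "xpos", "head", "deprel"]


def annotate(text, lang="en"):
    """Run Stanza on raw text and return its list-of-sentences dict form.

    Requires `pip install stanza` and a one-off `stanza.download(lang)`.
    """
    import stanza  # imported lazily so to_conll_tsv works without Stanza

    # Default pipeline for the language; a narrower processor list could be
    # passed via the `processors` argument to save time.
    nlp = stanza.Pipeline(lang)
    return nlp(text).to_dict()


def to_conll_tsv(sentences):
    """Turn Stanza-style sentence dicts into CoNLL-style tab-separated text.

    One word per line, one blank line after each sentence.
    """
    lines = []
    for sentence in sentences:
        for word in sentence:
            word_id = word["id"]
            # Some Stanza versions give ids as tuples; a tuple longer than
            # one marks a multi-word token range, which is skipped here.
            if isinstance(word_id, (tuple, list)):
                if len(word_id) > 1:
                    continue
                word_id = word_id[0]
            row = [str(word_id)]
            # Missing fields are written as "_", the usual CoNLL placeholder.
            row += [str(word.get(col, "_")) for col in COLUMNS[1:]]
            lines.append("\t".join(row))
        lines.append("")  # sentence boundary
    return "\n".join(lines) + "\n"
```

With Stanza and its English model installed, `to_conll_tsv(annotate("Great flight."))` returns the text to write to the .tsv file.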