Project for the Intelligent Systems course, which aims to give an overview of the basic steps to perform Natural Language Processing (NLP) with the R programming language.

angeligareta/natural-language-processing-r

Natural Language Processing with R

Project for the Intelligent Systems course of the EIT Digital data science master at UPM

Project summary

This project aims to give an overview of the basic steps to perform Natural Language Processing (NLP) with the R programming language.

In the first part of the assignment, the aim is to process a corpus found in the data folder, run a POS tagger on it, and manually check the results for some sentences. From the results, I concluded that the main error was that the POS tagger did not account for proper nouns in either singular or plural form, such as America or Americans.

In the second part, the goal is to improve the previous naive POS tagger by adding custom patterns that match certain POS tags, and to study the effect of those patterns in terms of precision and recall.
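As a rough illustration of that idea, the base R sketch below applies one custom pattern to the output of a naive tagger and compares per-tag precision and recall before and after. It is not the code from the R Markdown notebooks: the tokens, the gold and naive tags, the capitalization rule, and the function names `apply_proper_noun_pattern` and `tag_precision_recall` are all assumptions made up for this example, using Penn Treebank style tags (NNP, NNPS).

```r
# Hypothetical toy data: tokens, hand-checked (gold) tags, and naive tagger output.
tokens <- c("The", "Americans", "visited", "America", "yesterday")
gold   <- c("DT",  "NNPS",      "VBD",     "NNP",     "NN")
naive  <- c("DT",  "NNS",       "VBD",     "NN",      "NN")

# Assumed pattern: a capitalized word that is not sentence-initial is a proper
# noun; if it also ends in "s" it is treated as a plural proper noun.
apply_proper_noun_pattern <- function(tokens, tags) {
  capitalized <- grepl("^[[:upper:]][[:lower:]]+$", tokens) & seq_along(tokens) > 1
  plural <- capitalized & grepl("s$", tokens)
  tags[capitalized] <- "NNP"
  tags[plural] <- "NNPS"
  tags
}

# Precision and recall for a single tag, from parallel gold and predicted vectors.
tag_precision_recall <- function(gold, predicted, tag) {
  stopifnot(length(gold) == length(predicted))
  tp <- sum(predicted == tag & gold == tag)  # true positives
  fp <- sum(predicted == tag & gold != tag)  # false positives
  fn <- sum(predicted != tag & gold == tag)  # false negatives
  precision <- if (tp + fp == 0) NA_real_ else tp / (tp + fp)
  recall    <- if (tp + fn == 0) NA_real_ else tp / (tp + fn)
  c(precision = precision, recall = recall)
}

patched <- apply_proper_noun_pattern(tokens, naive)
print(tag_precision_recall(gold, naive, "NNP"))
print(tag_precision_recall(gold, patched, "NNP"))
```

On this toy sentence the naive tags never predict NNP, so precision is undefined (NA) and recall is 0, while the patched tags recover the proper noun. A pattern this simple would also mislabel capitalized common nouns on real text, which is why precision needs to be measured alongside recall.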

Implementation

The tasks were developed in the R programming language, using R Markdown to explain every step.
