afonsoxavier/order_sentences

order_sentences

R script to tokenize, split and arrange sentences by order of complexity

ordersentences.R tokenizes a text and splits it into sentences, then scores the sentences and orders them according to their length and their most common terms. To see the results, run the script and type the commands listed in the comments at the end of the script. The example text is cyntaf2.txt. To use any other text, change the file name in ordersentences.R.
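The steps described above can be sketched in a few lines of base R. This is an illustration only, not the code in ordersentences.R: the inline sample text, the names sentence_df, n_words and mean_freq, and the ordering rule (shorter sentences first, ties broken by how common their terms are) are all assumptions made for the example.

```r
# Minimal sketch, NOT the code from ordersentences.R.
# Sample text is hypothetical; the real script reads a file such as cyntaf2.txt.
text <- "The cat sat. The cat sat on the mat. A dog barked loudly at the cat on the mat."

# Split into sentences: runs of non-terminators followed by ., ! or ?
sentences <- trimws(regmatches(text, gregexpr("[^.!?]+[.!?]+", text))[[1]])

# Tokenize each sentence into lowercase alphabetic terms
tokens <- lapply(sentences, function(s) {
  w <- strsplit(tolower(s), "[^[:alpha:]]+")[[1]]
  w[nzchar(w)]
})

# Term frequencies over the whole text, as a named integer vector
term_table <- table(unlist(tokens))
freq <- setNames(as.vector(term_table), names(term_table))

# One row per sentence: word count and mean frequency of its terms
sentence_df <- data.frame(
  sentence  = sentences,
  n_words   = lengths(tokens),
  mean_freq = sapply(tokens, function(w) mean(freq[w])),
  stringsAsFactors = FALSE
)

# Assumed ordering rule: shortest first, then most common terms first
ranked <- sentence_df[order(sentence_df$n_words, -sentence_df$mean_freq), ]
print(ranked)
```

To try the sketch on a file instead of the inline sample, the text could be loaded with paste(readLines("cyntaf2.txt"), collapse = " "), assuming the file sits in the working directory.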

cmu_wfreq.R runs a series of analyses and plots graphs, mainly following Zipf's laws. As an example, and to make the analysis easier, brawddeg.csv and terms.csv are included as data files; they contain the same data as the data frames produced by ordersentences.R.
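As a rough illustration of the kind of check a Zipf analysis involves, the sketch below ranks terms by frequency and fits a line to log frequency against log rank. It is not taken from cmu_wfreq.R; the term counts are made up for the example (chosen so that frequency is exactly proportional to 1/rank), and the names counts, zipf and slope are hypothetical. The column layout of terms.csv is not assumed here.

```r
# Minimal sketch, NOT the code from cmu_wfreq.R.
# Hypothetical term counts; real counts would come from the tokenized text.
counts <- c(to = 300, the = 1200, a = 240, and = 400, that = 200, of = 600)

# Rank terms from most to least frequent
sorted <- sort(counts, decreasing = TRUE)
zipf <- data.frame(
  term = names(sorted),
  rank = seq_along(sorted),
  freq = as.vector(sorted),
  stringsAsFactors = FALSE
)

# Zipf's rank-frequency law predicts log(freq) falls roughly linearly
# with log(rank), with a slope close to -1
fit <- lm(log(freq) ~ log(rank), data = zipf)
slope <- unname(coef(fit)[2])
print(slope)
```

Calling plot(log(zipf$rank), log(zipf$freq)) would show the same relationship graphically; on real text the points usually scatter around the fitted line rather than falling exactly on it.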
