GitHub - mipayne/story-corpus-project

Project:
- tokenize, pos_tag, and lemmatize words from children's books
- record the average difficulty of the words in each book

Run book_values.py

File_functions:
- book_values: contains calculate_book_values function
  - uses inputs from modify_dictionary_from_excel, importing_txt_files, *_modify
  - contains code for importing excel files
  - print statements at end of script print, for each book:
    - title
    - total points awarded
    - total words (not including stopwords)
    - total words (including stopwords)
    - difficulty (total points awarded / total words (not including stopwords))
    - difficulty (total points awarded / total words (including stopwords))
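The two difficulty figures are plain ratios of points to word counts. The following is a minimal sketch of that arithmetic, not the code in book_values.py: the function name `book_difficulty`, the `points` dictionary, and the rule that unlisted words score 0 are all assumptions made for illustration.

```python
def book_difficulty(words, points, stopwords):
    """Return the totals and both difficulty ratios for one book.

    words:     list of lowercase word strings (hypothetical input)
    points:    dict mapping word -> point value (hypothetical)
    stopwords: set of words excluded from scoring (hypothetical)
    """
    content_words = [w for w in words if w not in stopwords]
    # Assumption: a word missing from the points dictionary scores 0.
    total_points = sum(points.get(w, 0) for w in content_words)
    n_without = len(content_words)
    n_with = len(words)
    return {
        "total_points": total_points,
        "words_without_stopwords": n_without,
        "words_with_stopwords": n_with,
        # Guard against empty books to avoid division by zero.
        "difficulty_without_stopwords": total_points / n_without if n_without else 0.0,
        "difficulty_with_stopwords": total_points / n_with if n_with else 0.0,
    }
```

For a six-word text with two stopwords and 6 points in total, this gives a difficulty of 2.0 without stopwords and 1.0 with them.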

resources: contains *.xlsx files and StoryCorpus

modify_dictionary_from_excel.py: converts Excel sheets to dictionaries and contains functions to find non-alphanumeric words and create the custom_stopwords list
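As a rough illustration of these two steps, the sketch below builds a dictionary from rows that have already been read from a sheet and then lists the keys that are not purely alphanumeric. The helper names `rows_to_dictionary` and `find_non_alphanumeric`, and the two-column (word, value) row layout, are assumptions and are not taken from the script.

```python
def rows_to_dictionary(rows):
    """Build a word -> value dictionary from (word, value) rows.

    Assumption: each row holds a word in the first column and its
    value in the second; the real sheets may be laid out differently.
    """
    return {str(word).strip().lower(): value for word, value in rows}


def find_non_alphanumeric(words):
    """Return the words containing any character other than letters or digits."""
    return [w for w in words if not w.isalnum()]
```

Entries such as "ice-cream" or "don't" are reported by `find_non_alphanumeric`, which makes it easy to review them by hand before scoring.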

importing_txt_files.py: tokenizes, pos_tags, and lemmatizes words from StoryCorpus files

fixing_strings (final_words_modifier): concatenates halves of contractions into wholes
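Tokenizers in the Penn Treebank style usually split a contraction into two tokens, for example "do" and "n't". A minimal sketch of gluing them back together is shown here; the function name `join_contractions` and the rule used to recognise the second half (the token is "n't" or starts with an apostrophe) are assumptions, not the logic in fixing_strings.py.

```python
def join_contractions(tokens):
    """Re-attach contraction suffixes such as "n't" or "'s" to the previous token."""
    joined = []
    for tok in tokens:
        # Assumption: a suffix is "n't" or an apostrophe followed by letters.
        is_suffix = len(tok) > 1 and (tok == "n't" or tok.startswith("'"))
        if joined and is_suffix:
            joined[-1] += tok
        else:
            joined.append(tok)
    return joined
```

A lone apostrophe is left as its own token, so quotation marks that were tokenized separately are not merged into the preceding word.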

punctuation_modify (final_words_modifier): removes punctuation from long strings, removes short strings with punctuation
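The README does not say where the line between "long" and "short" strings falls, so the sketch below takes it as a parameter. The function name `clean_punctuation` and the default `min_length` of 3 are assumptions chosen only for illustration.

```python
import string


def clean_punctuation(tokens, min_length=3):
    """Strip punctuation from long tokens and drop short tokens that contain it."""
    cleaned = []
    for tok in tokens:
        if not any(ch in string.punctuation for ch in tok):
            cleaned.append(tok)  # no punctuation, keep unchanged
        elif len(tok) >= min_length:
            stripped = "".join(ch for ch in tok if ch not in string.punctuation)
            if stripped:  # skip tokens made only of punctuation
                cleaned.append(stripped)
        # Short tokens that contain punctuation are dropped.
    return cleaned
```

Note that `string.punctuation` includes the apostrophe, so with this sketch a contraction such as "don't" would become "dont".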

numeral_word_modify (final_words_modifier): changes Arabic numeral strings to their equivalent word
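One way to do this conversion without extra dependencies is sketched below. It only covers 0 to 999 and leaves anything else untouched; the function names and that range limit are assumptions, and the actual script may handle numbers differently.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")


def replace_numerals(tokens):
    """Replace ASCII digit tokens below 1000 with their word form."""
    result = []
    for tok in tokens:
        if tok.isascii() and tok.isdigit() and int(tok) < 1000:
            result.append(number_to_words(int(tok)))
        else:
            result.append(tok)  # non-numeric or out of range: keep as-is
    return result
```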

propnoun_modify (final_words_modifier): removes proper nouns from the final words list
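If the words carry Penn Treebank part-of-speech tags, proper nouns are the ones tagged NNP (singular) or NNPS (plural). A minimal sketch of filtering on those tags follows; the function name `remove_proper_nouns` and the (word, tag) pair format are assumptions about how the data is shaped.

```python
PROPER_NOUN_TAGS = {"NNP", "NNPS"}  # Penn Treebank proper-noun tags


def remove_proper_nouns(tagged_words):
    """Drop (word, tag) pairs whose tag marks a proper noun."""
    return [(word, tag) for word, tag in tagged_words
            if tag not in PROPER_NOUN_TAGS]
```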

stopwords_modify (final_words_modifier): removes stopwords from the final words list
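Stopword removal amounts to a membership test against a set. The sketch below compares case-insensitively; the function name `remove_stopwords` and the case handling are assumptions, and the stopword set stands in for the project's custom_stopwords list.

```python
def remove_stopwords(words, stopwords):
    """Return the words whose lowercase form is not in the stopword set."""
    stopword_set = {s.lower() for s in stopwords}
    return [w for w in words if w.lower() not in stopword_set]
```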

find_propernouns: NOT USED IN BOOK_VALUES. Finds proper nouns

word_search: NOT USED IN BOOK_VALUES. Contains a framework for looking up words from books in the All and Easy dictionaries by mapping each pos_tag to specific keys
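The README does not describe the keys used in the All and Easy dictionaries, so the following is only a sketch of the general idea. It assumes Penn Treebank tags, dictionaries keyed by (word, part-of-speech key) tuples, and a lookup that tries Easy before All; the names `TAG_PREFIX_TO_KEY`, `pos_key`, and `search_word` are hypothetical.

```python
# Assumption: the first two letters of a Penn Treebank tag give the coarse class.
TAG_PREFIX_TO_KEY = {
    "NN": "noun",
    "VB": "verb",
    "JJ": "adjective",
    "RB": "adverb",
}


def pos_key(tag):
    """Map a pos_tag such as "VBD" to a coarse dictionary key, or None."""
    return TAG_PREFIX_TO_KEY.get(tag[:2])


def search_word(word, tag, easy_dict, all_dict):
    """Look up (word, key) in the Easy dictionary first, then in All."""
    key = pos_key(tag)
    if key is None:
        return None  # tag has no corresponding dictionary key
    entry = (word, key)
    if entry in easy_dict:
        return ("Easy", easy_dict[entry])
    if entry in all_dict:
        return ("All", all_dict[entry])
    return None
```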

Folders and files (65 commits):
- resources
- .gitignore
- Book_values_w_book_tot_modify.py
- README.md
- StoryCorpus_copy
- book_tot_modify.py
- book_values.py
- find_propernouns.py
- fixing_strings.py
- fixing_strings.pyc
- importing_txt_files.py
- importing_txt_files.pyc
- modify_dictionary_from_excel.py
- modify_dictionary_from_excel.pyc
- numeral_word_modify.py
- numeral_word_modify.pyc
- propnoun_modify.py
- punctuation_modify.py
- punctuation_modify.pyc
- stopwords_modify.py
- stopwords_modify.pyc
- word_search.py
