scripts
rename_webanno.py renames all files in an unzipped WebAnno project to match the comment counter names used elsewhere in this project.

clean_comments.py removes lines beginning with '#' from the curated files of a WebAnno project. It takes the path to a '.../curation' folder in an exported WebAnno project and assumes the project was exported in TSV format. Removing these lines allows the TSVs to be read by the Pandas package in Python.

combine_webanno.py combines all WebAnno TSVs, pulling them from multiple folders into one TSV.

webanno_to_sentence.py takes the output of combine_webanno.py, a mapping of comment names, and a constructiveness and toxicity CSV. It combines Attitude, negation, constructiveness, and toxicity annotations into one dataframe, with a row for each sentence of the comment followed by a row for each annotated span in the comment. Constructiveness and toxicity were annotated at the comment level (not the word level), so every sentence and span row within a comment also has a column indicating whether the comment was constructive or toxic.

webanno_to_span.py asks for a path to a folder containing the files cleaned by clean_comments.py and writes a new CSV of those annotations, combined, to a path of your choice. It reorganizes the data so that each row represents one annotated span (in the original files, each row represents a word). It then sorts the rows by their position in the comment (so that the comments can be read from top to bottom in order) and then by the size of the annotation (so that longer spans appear before shorter ones). old_combine_comments.py is the older version of this script.

appraisal_analysis.R uses the combined dataframe to compute various counts and analyses. The analysis is complete, but the code is still messy.
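
As a rough illustration of two steps described above, the sketch below drops lines beginning with '#' (as clean_comments.py is described as doing) and orders spans by position, with longer spans first (as described for webanno_to_span.py). It is not the code in these scripts: the function names, the sample text, and the "start", "end", and "label" fields are all hypothetical, and real WebAnno TSVs have more columns than shown here.

```python
import csv
import io


def strip_comment_lines(text):
    """Return the text without any line that begins with '#'."""
    kept = [line for line in text.splitlines() if not line.startswith("#")]
    return "\n".join(kept) + "\n"


def sort_spans(spans):
    """Order spans by start offset, then by length with longer spans first."""
    return sorted(spans, key=lambda s: (s["start"], -(s["end"] - s["start"])))


# Hypothetical miniature export: two header lines, then one row per word.
raw = (
    "#header line (hypothetical)\n"
    "#Text=Good point.\n"
    "1-1\t0-4\tGood\n"
    "1-2\t5-10\tpoint\n"
)

cleaned = strip_comment_lines(raw)

# With the '#' lines gone, the remainder parses as plain tab-separated rows.
rows = list(csv.reader(io.StringIO(cleaned), delimiter="\t"))

# Hypothetical spans: two of them start at the same offset.
spans = [
    {"start": 5, "end": 10, "label": "short"},
    {"start": 0, "end": 4, "label": "first"},
    {"start": 5, "end": 20, "label": "long"},
]
ordered = [s["label"] for s in sort_spans(spans)]
```

Negating the span length in the sort key keeps a single ascending sort while still placing the longer of two spans that start at the same offset first.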