mtclevans/semantictextanalysis


PL-Marker Evaluation Scripts

The PL-Marker setup guide provides pre-trained models and evaluation scripts. To run the evaluation:
1. Import PL-Marker trained models for SciERC NER and RE
2. Import SciBERT pre-trained model dependencies
3. Import PL-Marker model dependencies: requirement.txt and custom Transformers file
4. Upload SciERC dataset
Note: Ensure the data dir matches the one set in the evaluation script
5. Ensure the Runtime type is set to GPU for consistency with the original PL-Marker experiment
6. Use the PL-Marker QuickStart to evaluate the pre-trained PL-Marker NER model for SciERC
Note: Ensure you have created the output dir specified in the evaluation script
7. Use the PL-Marker QuickStart to evaluate the pre-trained PL-Marker RE model for SciERC with run_re.py
Note: Ensure you have created the output dir specified in the evaluation script
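
The two notes above (matching data dir, existing output dir) can be checked before launching an evaluation. The following is a minimal sketch, not part of this repository: the function name `check_eval_dirs`, the example paths, and the default file name `test.json` are assumptions and should be replaced with the values used in your evaluation script.

```python
from pathlib import Path


def check_eval_dirs(data_dir, output_dir, required_files=("test.json",)):
    """Report missing dataset files and make sure the output dir exists.

    All names here are illustrative; pass the paths and file names
    that your own evaluation script actually uses.
    """
    data_path = Path(data_dir)
    # Collect every expected dataset file that is not present in data_dir.
    missing = [name for name in required_files
               if not (data_path / name).is_file()]
    # Create the output dir (and any parents) if it does not exist yet.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    return missing
```

A call such as `check_eval_dirs("scierc", "sciner_models/eval")` returns an empty list when every expected file is in place; a non-empty list names the files to upload before running the script.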

Semantic Text Analysis Framework

Python

1. Import framework dependencies: pandas, NLTK, matplotlib.pyplot, sklearn.metrics.ConfusionMatrixDisplay
2. Import all required functions for the framework
3. Undertake analysis on the SciERC dataset distribution
4. Undertake confusion matrix analysis for NER and RE
5. Generate POS Tags for Qualitative Data Analysis Software import
6. Transform data and generate .xlsx for Qualitative Data Analysis Software import
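
Steps 3 and 4 can be sketched as follows. This is an illustration under stated assumptions rather than the framework's own code: it assumes the SciERC files are JSON lines whose `ner` and `relations` fields hold, per sentence, lists that end in the label, and the helper names `label_distribution` and `labelled_confusion_matrix` are hypothetical.

```python
import json
from collections import Counter

from sklearn.metrics import confusion_matrix


def label_distribution(jsonl_lines):
    """Count entity and relation labels across JSON-lines documents.

    Assumes each document has "ner" and "relations" fields, each a list
    (one entry per sentence) of lists whose last element is the label.
    """
    ner_counts, rel_counts = Counter(), Counter()
    for line in jsonl_lines:
        if not line.strip():
            continue  # skip blank lines
        doc = json.loads(line)
        for sentence_spans in doc.get("ner", []):
            for span in sentence_spans:
                ner_counts[span[-1]] += 1
        for sentence_rels in doc.get("relations", []):
            for rel in sentence_rels:
                rel_counts[rel[-1]] += 1
    return ner_counts, rel_counts


def labelled_confusion_matrix(gold, predicted):
    """Return (labels, matrix); rows are gold labels, columns predictions."""
    labels = sorted(set(gold) | set(predicted))
    matrix = confusion_matrix(gold, predicted, labels=labels)
    return labels, matrix
```

The returned matrix can be plotted with `ConfusionMatrixDisplay(confusion_matrix=matrix, display_labels=labels).plot()`, which is the class listed in the framework dependencies.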

Semantic Text Analysis

We used Nvivo (REF) on Windows.
Two users annotated relations to generate a Kappa score.
Please note: Not all Nvivo features are available on macOS:
1. Queries produce different results on macOS because they are not filtered by user.
2. Cluster Analysis and Pearson’s Correlation Coefficient are available on Windows only.
More information can be found here

1. Import exported qdas.xlsx file into Nvivo
2. Undertake Semantic Text Analysis using the codebooks provided
Note: We annotated all themes twice to generate a Kappa statistic.
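
For readers unfamiliar with the Kappa statistic, the sketch below computes Cohen's kappa for two annotators who each assign one label per item. It is a generic illustration only: Nvivo produces its own Kappa values through its coding comparison feature, and the function name `cohen_kappa` and the example theme labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


annotator_1 = ["theme_a", "theme_a", "theme_b", "theme_b"]
annotator_2 = ["theme_a", "theme_b", "theme_b", "theme_b"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

In this example the annotators agree on 3 of 4 items (observed agreement 0.75), chance agreement is 0.5, and kappa is therefore 0.5.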

Results

1. PL-Marker model F1 scores and standard deviation are presented here
2. Database statistics are presented here
3. Confusion matrices are presented here
4. The Nvivo file for the Semantic Text Analysis annotations undertaken in this research is presented here
- Queries for theme co-occurrence and validation are present in this Nvivo file under Explore -> Queries -> Query Criteria
- Please note: These results will not match the research when running on macOS.
- Query results for theme co-occurrence and validation are present in this Nvivo file under Explore -> Queries -> Query Results
- Please note: These results will not match the research when running on macOS.
- Kappa statistics are presented here
- Pearson's Correlation Coefficients must be regenerated at runtime. Please navigate to: Explore -> Diagrams -> Cluster Analysis -> Select codes and click ‘next’ -> Select all codes -> Click ‘Finish’ -> Navigate to ‘Summary’ -> Right-click and ‘Export list’
- Please note: This feature is not available on macOS.
