Data-Science-for-Linguists-2022/UN-Parallel-Corpora-Analysis

United Nations 6-Way Parallel Corpora Analysis

May 1st, 2022. By: Kinan Al-Mouk

Goal: Explore the linguistic elements of the six official languages of the United Nations: English, Spanish, French, Russian, Arabic, and Mandarin Chinese.

Data Source: United Nations, Department for General Assembly and Conference Management: UN Parallel Corpora

Summary

This is my term project submission for LING1340 Data Science for Linguists, taught by Na-Rae Han at the University of Pittsburgh. All data was obtained from the UN website and processed using NLTK and spaCy.

Directory

Guestbook
