dd-jason-chad/nlp_project

Repository for the Natural Language Processing (NLP) project by Ednalyn C. De Dios, Jason Dunn, and Chad Hackney, in the Data Science program at Codeup, San Antonio, Texas.

The repo will contain all project-related materials, including the final presentation, the primary project Jupyter Notebook, and any 'helper' Python files: acquire.py, preparation.py, explore.py, and model.py.

Final project presentation: https://docs.google.com/presentation/d/1rmbv9_Fujs50alcOsIi6RGRaCrAp4Akbexeo6D1UrRQ/edit#slide=id.p

Project due date: 13 May 19

Project start: 9 May 19

Project involves:

- web-scraping a designated set of webpages/GitHub repositories;
- retrieving and looking specifically at the README.md file associated with each GitHub repo;
- combining and analyzing the words contained in the README files;
- and then modeling the words to predict the language used to generate the contents of each repository.
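The last two steps above (combining the README words, then modeling them to predict a repository's language) could be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the project's actual code: the `tokenize` helper, the `NaiveBayes` class, the choice of a multinomial naive Bayes model, and the toy README strings are all assumptions made for the example. The real logic lives in the notebook and helper files.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(readme_text):
    """Lowercase a README and return its alphabetic word tokens (hypothetical helper)."""
    text = readme_text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs before splitting into words
    return re.findall(r"[a-z]+", text)


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed model choice)."""

    def fit(self, documents, labels):
        self.word_counts = defaultdict(Counter)  # language -> word frequencies
        self.label_counts = Counter(labels)      # language -> number of READMEs
        for doc, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, document):
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(document) if t in self.vocab]
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)  # smoothed log likelihood
            scores[label] = score
        return max(scores, key=scores.get)


# Toy READMEs invented for illustration; real input would be the scraped README.md files.
readmes = [
    "Install with pip install and run python manage.py runserver",
    "A Django app; requires Python and virtualenv",
    "Run npm install then npm start to launch the node server",
    "Built with webpack and react, install using npm",
]
languages = ["Python", "Python", "JavaScript", "JavaScript"]

model = NaiveBayes().fit(readmes, languages)
print(model.predict("pip install the package in a virtualenv"))  # Python
```

A library implementation would normally replace the hand-rolled classifier; it is written out here only to make the word-count-to-prediction step visible.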

GitHub repositories used:

From its public website, https://www.texastribune.org: "The Texas Tribune is the only member-supported, digital-first, nonpartisan media organization that informs Texans — and engages with them — about public policy, politics, government and statewide issues."
