
This project consists of a data scraper for Reddit, a data filter that preprocesses the gathered data, and several NLP models that classify texts as depressive or non-depressive.


TheBlueEngineer/Serene-1.0


Serene-1.0

Fields: Data scraping, data analysis, NLP machine learning

Technologies: Pushshift.io, Python, Keras with TensorFlow, Scikit-learn, Google Colab

  1. Description of the project
  2. Initial project ideas
  3. Accomplishments

Description of the project

Serene is a web-based application that uses a machine learning model to detect depressive or non-depressive tendencies in a user's texts. I intend to develop this project further and turn it into a reliable tool for people.

Initial project ideas

  • Scrape data from Reddit.
  • Preprocess the data.
  • Build NLP models for text classification.
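
The preprocessing step could look roughly like the sketch below. It is not the project's actual filter: the cleaning rules (dropping URLs, subreddit and user mentions, and Reddit's "[removed]" / "[deleted]" placeholders) and the function names `clean_post` and `clean_corpus` are assumptions chosen for illustration.

```python
import re

# Illustrative cleaning rules; the real filter in this project may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"(?<!\w)/?[ur]/\w+")  # r/sub, /r/sub, u/name, /u/name
WHITESPACE_RE = re.compile(r"\s+")
PLACEHOLDERS = {"[removed]", "[deleted]"}  # bodies Reddit shows for missing posts


def clean_post(text):
    """Return a normalized post, or None if it carries no usable text."""
    if text is None or text.strip() in PLACEHOLDERS:
        return None
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = WHITESPACE_RE.sub(" ", text).strip().lower()
    return text or None


def clean_corpus(posts):
    """Clean every post and drop the ones that end up empty."""
    cleaned = (clean_post(post) for post in posts)
    return [post for post in cleaned if post is not None]
```

For example, `clean_corpus(["Hello  World", "[deleted]", None])` keeps only the first entry, normalized to lowercase with single spaces.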

Accomplishments

  • Scrape data from Reddit's subreddits with the Pushshift.io API.
  • Preprocess the textual data.
  • Build the machine learning models using Scikit-learn and Keras with TensorFlow.
  • Develop a deeper knowledge of logistic regression, SGDClassifier, word-level CNNs, character-level CNNs and the optimization of deep neural networks.
  • Train the models in Google Colab Jupyter notebooks, running on Colab's hosted backend.
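
As a rough illustration of the input a character-level CNN works with, the sketch below turns a text into a fixed-length sequence of character indices. The alphabet, the reserved padding and unknown indices, and the name `encode_chars` are assumptions made for this example, not details taken from the project.

```python
# Hypothetical alphabet; the models in this project may use a different one.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,;:!?'\"-()"
PAD, UNKNOWN = 0, 1  # reserved indices for padding and out-of-alphabet characters
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_chars(text, max_len=16):
    """Map text to exactly max_len integer indices, truncating or padding."""
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN) for ch in text.lower()[:max_len]]
    return indices + [PAD] * (max_len - len(indices))
```

Sequences of this shape are what an embedding layer followed by 1D convolutions would typically consume. A word-level model would index tokens instead of characters.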
