This project aims to detect hate speech against LGBT+ individuals on social media. It is built using NLP and machine learning algorithms.

parth9504/Detection-of-Hate-Speech-Against-LGBT-

Detection of Hate Speech Against LGBT+ on Social Media

This project aims to detect potential hate speech targeting people of the LGBTQIA+ community. The model is trained with a supervised learning approach: it learns patterns from labeled data and uses them to make predictions on unseen text. The training dataset contains examples of hate speech and non-hate speech texts related to LGBTQIA+ topics.

The project applies key concepts of Natural Language Processing and employs TF-IDF to vectorize the text.

Text preprocessing consisted of basic operations: removing URLs, stop words, punctuation marks, and digits, followed by tokenization and lemmatization. After this, the most frequent words were plotted as word clouds.

Screenshot (366) Screenshot (367)
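The cleaning steps described above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the `preprocess` function, the tiny `STOP_WORDS` set, and the `lemmatize` parameter are all hypothetical names. Lemmatization is left as a pluggable callable (identity by default), since it normally comes from a library such as NLTK or spaCy.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list; a real project would use a
# fuller list such as the one shipped with NLTK or spaCy.
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "this", "to"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text, lemmatize=lambda token: token):
    """Return a list of cleaned tokens for one piece of text."""
    text = URL_RE.sub(" ", text.lower())   # drop URLs first, while "://" is intact
    text = text.translate(PUNCT_TABLE)     # drop punctuation marks
    text = re.sub(r"\d+", " ", text)       # drop digits
    tokens = text.split()                  # simple whitespace tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [lemmatize(t) for t in tokens]  # identity unless a lemmatizer is passed


tokens = preprocess("Check this out: https://example.com 123 times!")
```

URLs are removed before punctuation on purpose: stripping punctuation first would break the `://` pattern the URL regex relies on.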

TF-IDF (Term Frequency-Inverse Document Frequency) vectorization was used in this project to convert text into numerical features. TF-IDF assigns weights to words based on their frequency in a document relative to their frequency in the entire corpus, helping capture the importance of words in a document.
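To make the weighting concrete, here is a small sketch of the textbook formula, tf(t, d) × log(N / df(t)), written from scratch. The function name `tf_idf` is hypothetical, and this is not necessarily the exact variant the project uses: library implementations (for example scikit-learn's `TfidfVectorizer`) typically add smoothing and normalization on top of this basic idea.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute textbook TF-IDF weights for a list of tokenized documents."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            # Term frequency (share of the document) times inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


weights = tf_idf([["a", "b"], ["a", "c"]])
```

In this toy corpus the term `a` appears in every document, so its weight is zero, while `b` and `c` each appear in only one document and receive a positive weight.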

For the classification, Random Forest was used, which is an ensemble learning method that builds multiple decision trees during training and combines their predictions to make a final prediction.
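A TF-IDF plus Random Forest classifier of this kind might be assembled with scikit-learn as sketched below. The choice of scikit-learn, the hyperparameters, and the placeholder training texts are assumptions made for illustration; the source does not state which library or settings the project uses.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Placeholder training data (hypothetical); 1 = hate speech, 0 = not hate speech.
train_texts = [
    "placeholder abusive message one",
    "placeholder abusive message two",
    "placeholder friendly message one",
    "placeholder friendly message two",
]
train_labels = [1, 1, 0, 0]

# The pipeline vectorizes raw text with TF-IDF, then feeds it to the forest.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(train_texts, train_labels)

# Predict a label for unseen text.
predictions = model.predict(["another placeholder friendly message"])
```

Wrapping both steps in one pipeline means the same vectorizer vocabulary is applied at training and prediction time, which avoids a common source of mismatched features.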

The project not only predicts hate speech in text entered by the user but also supports scraping tweets, scraping comments from a YouTube video, and uploading CSV files. For scraping tweets, the Python library ntscraper was used, while for YouTube comments, the YouTube API was utilized. The Streamlit library was used to build and design the user interface.

Refer to the attached screenshots, which demonstrate the project at work. Screenshot (368) Screenshot (369)

The "Youtube Comments Analysis" section allows users to enter any valid YouTube video link and scrape its comments to make predictions on that data. The results are displayed on the screen.
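Fetching comments through the YouTube Data API requires a video ID rather than a full link, so a step like the following is usually needed first. This is a sketch under assumptions: the helper name `extract_video_id` is hypothetical, only the two most common link shapes are handled, and the source does not show how the project itself parses links.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        # Watch links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


video_id = extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-")
```

Returning `None` for unrecognized links lets the interface show a validation message instead of sending a bad request to the API.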

Screenshot (370)

The "File Upload" section allows users to import any CSV file that has a 'text' column. The textual data is then analysed for the presence of hate speech, and the results are displayed on the screen.
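The 'text' column requirement could be enforced along these lines. This is a minimal standard-library sketch, not the project's code: `classify_csv` and its `predict` argument are hypothetical names, and the stand-in predictor simply labels everything as 0.

```python
import csv
import io


def classify_csv(file_obj, predict):
    """Read a CSV with a 'text' column and return (text, label) pairs."""
    reader = csv.DictReader(file_obj)
    if reader.fieldnames is None or "text" not in reader.fieldnames:
        raise ValueError("CSV file must contain a 'text' column")
    # Skip rows whose text cell is empty or missing.
    texts = [row["text"] for row in reader if row["text"]]
    return list(zip(texts, predict(texts)))


# Usage with an in-memory file and a stand-in predictor (hypothetical).
sample = io.StringIO("id,text\n1,first comment\n2,second comment\n")
results = classify_csv(sample, predict=lambda texts: [0 for _ in texts])
```

Checking the header before reading any rows gives the user a clear error for a wrongly shaped file rather than a failure partway through prediction.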

Screenshot (371)

The "Tweets Analysis" section lets users enter a search term or a valid Twitter username, scrape the matching tweets, and make predictions on the scraped data.

Screenshot (372)
