Animation by [3]
Sentiment analysis on Sarcasm dataset
and text generation on William Shakespere sonnets have been implemened in this repo.
Sentiment analysis is the process of analyzing digital text to determine if the emotional tone of the message is positive, negative, or neutral, however, in our case we are dealing with sarcastic and non sarcastic headlines from the news articles. Today, companies have large volumes of text data like emails, customer support chat transcripts, social media comments, and reviews. Sentiment analysis tools can scan this text to automatically determine the author’s attitude towards a topic. Companies use the insights from sentiment analysis to improve customer service and increase brand reputation. [1]
Text generation is a process where an AI system produces written content, imitating human language patterns and styles. The process involves generating coherent and meaningful text that resembles natural human communication. Text generation has gained significant importance in various fields, including natural language processing, content creation, customer service, and coding assistance. [2]
, in this project, I am using LSTM
based model to generate text based on William Shakespere's sonnet dataset.
sentiment_analysis_and_text_generation
├── __init__.py
├── data
│ └── Sonnet.txt
├── images
│ └── ...
├── README.md
├── requirements.txt
├── sentiment_analysis
│ └── sentiment_analysis.ipynb
├── text_generation
│ └── text_generation.ipynb
└── weights
├── meta.tsv
└── vecs.tsv
Overall structure of the repository.
The section involves downloading the Sarcasm dataset from the internet, preporcesing, training, inferencing:
- Lower casing
- Removing HTML tags and URLs from the text
- Stop words removal
- Lemmization
- Dataset split
- Tokenization
- Sequence Padding
After preprocessing we have set some hyperparameters for training and test, those parameters will influcnece the performance metrics.
General | Model | |||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|||||||||||||||||||||||||||||||||
Fig 1: LSTM based model for sentiment analysis
git clone https://github.com/faizan1234567/sentiment_analysis_and_text_generation
cd sentiment_analysis_and_text_generation
Create and activate Anaconda Environment
conda create -n nlp python=3.9.0
conda activate nlp
Now install all the required dependencies
pip install --upgrade pip
pip install -r requirements.txt
Installation done !
In the text generation step the william shakespere sonnet dataset has been used. The following steps covered.
- Dataset loading
- Lowercasing
- Inspecting dataset
- Tokenization
- Padding to have uniform length sequences
Following the preprocessing, the model is define and trained with the following hypermeters:
General | Model | |||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|||||||||||||||||||||||||
[1]. Amazon Web servies. (1978). What is ... Amazon. https://aws.amazon.com/what-is/sentiment-analysis/
[2]. Data camp. (1978). Text generation. Amazon. https://www.datacamp.com/blog/what-is-text-generation
[3]. C. (2022, January 5). Introduction to Natural Language Processing(NLP) — Part 1. Medium. https://medium.com/@chinmaychetan04/ introduction-to-natural-language-processing-nlp-part-1-2f2dfbc305d6