sentiment analysis
amaiya committed Apr 21, 2023
1 parent ca40637 commit 05a6109
Showing 9 changed files with 211 additions and 4 deletions.
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -6,10 +6,10 @@ Most recent releases are shown at the top. Each release shows:
- **Changed**: Additional parameters, changes to inputs or outputs, etc
- **Fixed**: Bug fixes that don't change documented behaviour

## 0.35.2 (TBD)
## 0.36.dev (TBD)

### new:
- N/A
- easy-to-use wrapper for sentiment analysis

### changed
- N/A
11 changes: 11 additions & 0 deletions README.md
@@ -12,6 +12,16 @@


### News and Announcements
- **2023-04-21**
- **ktrain 0.36.x** is released and supports **Sentiment Analysis**. See the [example notebook](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/develop/examples/text/sentiment_analysis_example.ipynb) for more information.
```python
# Example: Sentiment Analysis
from ktrain.text.sentiment import SentimentAnalyzer
classifier = SentimentAnalyzer()
result = classifier.predict('I got a promotion today.')
# OUTPUT:
# {'POSITIVE': 0.9021117091178894}
```
- **2023-04-01**
- **ktrain 0.35.x** is released and supports **Generative AI** using an instruction-fine-tuned version of GPT-J that can run on your own machine. See the [example notebook](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/develop/examples/text/generative_ai_example.ipynb) for more information. Supply prompts in the form of instructions for what you want the model to do:
```python
@@ -378,6 +388,7 @@ can be used out-of-the-box **without** having TensorFlow installed, as summarize
| [Speech Transcription](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/develop/examples/text/speech_transcription_example.ipynb) (pretrained) ||||
| [Image Captioning](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/develop/examples/vision/image_captioning_example.ipynb) (pretrained) ||||
| [Object Detection](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/develop/examples/vision/object_detection_example.ipynb) (pretrained) ||||
| [Sentiment Analysis](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/develop/examples/text/sentiment_analysis_example.ipynb) (pretrained) ||||
| [Topic Modeling](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/master/tutorials/tutorial-05-learning_from_unlabeled_text_data.ipynb) (sklearn) ||||
| [Keyphrase Extraction](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/develop/examples/text/keyword_extraction_example.ipynb) (textblob/nltk/sklearn) ||||

2 changes: 2 additions & 0 deletions examples/README.md
@@ -19,6 +19,7 @@ This directory contains various example notebooks using *ktrain*. The directory
- [Universal Information Extraction](#extraction): an example of using a Question-Answering model for information extraction
- [Keyphrase Extraction](#kwextraction): an example of keyphrase extraction in **ktrain**
- [Indonesian Text Examples](#indonesian): examples such as zero-Shot text classification and question-answering on Indonesian text by [Sandy Khosasi](https://github.com/ilos-vigil)
- [Sentiment Analysis Examples](#sentiment): simple-to-use sentiment analysis
- [Generative AI Examples](#generativeai): provide instructions to a language model running on your own machine to solve various tasks
- `vision`:
- [image classification](#imageclass): image classification examples using various models and datasets
@@ -155,6 +156,7 @@ The objective of the CoNLL2003 task is to classify sequences of words as belongi
### <a name="extraction"></a>Universal Information Extraction: [qa_information_extraction.ipynb](https://github.com/amaiya/ktrain/tree/master/examples/text)
### <a name="kwextraction"></a>Keyphrase Extraction: [keyword_extraction_example.ipynb](https://github.com/amaiya/ktrain/tree/master/examples/text)
### <a name="indonesian"></a> [Indonesian NLP examples by Sandy Khosasi](https://github.com/ilos-vigil/ktrain-assessment-study) including Indonesian question-answering, emotion recognition, and document similarity
### <a name="sentiment"></a> Sentiment Analysis: [sentiment_analysis_example.ipynb](https://github.com/amaiya/ktrain/tree/master/examples/text/)
### <a name="generativeai"></a> Generative AI Using GPT: [generative_ai_example.ipynb](https://github.com/amaiya/ktrain/tree/master/examples/text/generative_ai_example.ipynb)


102 changes: 102 additions & 0 deletions examples/text/sentiment_analysis_example.ipynb
@@ -0,0 +1,102 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%reload_ext autoreload\n",
"%autoreload 2\n",
"%matplotlib inline\n",
"import os\n",
"os.environ[\"CUDA_DEVICE_ORDER\"]=\"PCI_BUS_ID\";\n",
"os.environ[\"CUDA_VISIBLE_DEVICES\"]=\"0\"; "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from ktrain.text.sentiment import SentimentAnalyzer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"classifier = SentimentAnalyzer()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"texts = [\"The lower pollen count has provided some relief from my allergies.\", \n",
" \"It looks like there will be cost overruns.\",\n",
" \"I will be at a doctor's appointment at 3:30pm.\",\n",
" \"Tesla stock is falling.\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'POSITIVE': 0.8364812731742859},\n",
" {'NEGATIVE': 0.7623286247253418},\n",
" {'NEUTRAL': 0.9303346276283264},\n",
" {'NEGATIVE': 0.7317317724227905}]"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"classifier.predict(texts)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'POSITIVE': 0.9378765821456909,\n",
" 'NEUTRAL': 0.06050467491149902,\n",
" 'NEGATIVE': 0.0016188238514587283}"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"classifier.predict(\"I got a promotion at work today.\", return_all_scores=True) "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "python3",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
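Note on the notebook output: with `return_all_scores=True` the result is a dictionary holding one score per label. A minimal sketch of picking the winning label from such a dictionary follows. The scores are copied from the notebook cell above, and `top_label` is a hypothetical helper, not part of ktrain.

```python
# Scores copied from the notebook output (return_all_scores=True).
scores = {
    "POSITIVE": 0.9378765821456909,
    "NEUTRAL": 0.06050467491149902,
    "NEGATIVE": 0.0016188238514587283,
}


def top_label(scores: dict) -> tuple:
    """Hypothetical helper: return the (label, score) pair with the highest score."""
    label = max(scores, key=scores.get)
    return label, scores[label]


label, score = top_label(scores)
print(label, round(score, 4))  # prints: POSITIVE 0.9379
```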
1 change: 1 addition & 0 deletions ktrain/text/sentiment/__init__.py
@@ -0,0 +1 @@
from .core import SentimentAnalyzer
81 changes: 81 additions & 0 deletions ktrain/text/sentiment/core.py
@@ -0,0 +1,81 @@
from typing import Union
from transformers import pipeline

from ... import utils as U
from ...torch_base import TorchBase


class SentimentAnalyzer(TorchBase):
    """
    interface to Sentiment Analyzer
    """

    def __init__(self, device=None, **kwargs):
        """
        ```
        SentimentAnalyzer constructor
        Args:
          device(str): device to use (e.g., 'cuda', 'cpu')
          kwargs: additional keyword arguments passed to the transformers pipeline
        ```
        """

        super().__init__(
            device=device, quantize=False, min_transformers_version="4.12.3"
        )
        self.pipeline = pipeline(
            "text-classification",
            model="cardiffnlp/twitter-roberta-base-sentiment",
            device=self.device_to_id(),
            **kwargs
        )
        # map the model's raw labels to human-readable sentiment names
        self.mapping = {
            "LABEL_0": "NEGATIVE",
            "LABEL_1": "NEUTRAL",
            "LABEL_2": "POSITIVE",
        }

    def predict(
        self,
        texts: Union[str, list],
        return_all_scores=False,
        batch_size=U.DEFAULT_BS,
        **kwargs
    ):
        """
        ```
        Performs sentiment analysis
        This method accepts a single text or a list of texts and predicts each
        sentiment as one of 'NEGATIVE', 'NEUTRAL', or 'POSITIVE'.
        Args:
          texts: str|list
          return_all_scores(bool): If True, return all labels/scores
          batch_size: size of batches sent to model
        Returns:
          A dictionary of labels and scores if texts is a str,
          or a list of such dictionaries if texts is a list
        ```
        """
        str_input = isinstance(texts, str)
        if str_input:
            texts = [texts]
        chunks = U.batchify(texts, batch_size)
        results = []
        for chunk in chunks:
            preds = self.pipeline(
                chunk, top_k=len(self.mapping) if return_all_scores else 1, **kwargs
            )
            results.extend(preds)
        results = [self._flatten_prediction(pred) for pred in results]
        # return a single dictionary when a single string was supplied
        return results[0] if str_input else results

    def _flatten_prediction(self, prediction: list):
        """
        ```
        flatten prediction to the form {'label':score}
        ```
        """
        return_dict = {}
        for d in prediction:
            return_dict[self.mapping[d["label"]]] = d["score"]
        return return_dict
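Note on `predict`: the control flow (wrap a single string, batch, call the pipeline, flatten, unwrap) can be exercised without downloading the model. The sketch below is a standalone approximation, not the ktrain implementation. `fake_pipeline` is a hypothetical stub with fixed scores, shaped the way `_flatten_prediction` expects (one list of `{label, score}` dicts per text), and `batchify` is a stand-in that assumes `U.batchify` yields consecutive chunks.

```python
MAPPING = {"LABEL_0": "NEGATIVE", "LABEL_1": "NEUTRAL", "LABEL_2": "POSITIVE"}


def batchify(items, size):
    """Stand-in for U.batchify (assumed behaviour): yield consecutive chunks."""
    for i in range(0, len(items), size):
        yield items[i : i + size]


def fake_pipeline(chunk, top_k=1):
    """Hypothetical stub: fixed scores, one list of top_k dicts per input text."""
    ranked = [
        {"label": "LABEL_2", "score": 0.7},
        {"label": "LABEL_1", "score": 0.2},
        {"label": "LABEL_0", "score": 0.1},
    ]
    return [ranked[:top_k] for _ in chunk]


def flatten(prediction):
    """Convert [{'label': ..., 'score': ...}, ...] into {name: score}."""
    return {MAPPING[d["label"]]: d["score"] for d in prediction}


def predict(texts, return_all_scores=False, batch_size=2):
    str_input = isinstance(texts, str)
    if str_input:
        texts = [texts]  # treat a single string as a one-element batch
    results = []
    for chunk in batchify(texts, batch_size):
        top_k = len(MAPPING) if return_all_scores else 1
        results.extend(fake_pipeline(chunk, top_k=top_k))
    results = [flatten(p) for p in results]
    return results[0] if str_input else results


print(predict("one text"))                          # {'POSITIVE': 0.7}
print(predict(["a", "b", "c"]))                     # three {'POSITIVE': 0.7} dicts
print(predict("one text", return_all_scores=True))  # all three labels
```

A string input yields a single dictionary and a list input yields a list of dictionaries, which matches the return logic in the diff above.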
2 changes: 1 addition & 1 deletion ktrain/version.py
@@ -1,2 +1,2 @@
__all__ = ["__version__"]
__version__ = "0.35.2"
__version__ = "0.36.dev"
2 changes: 1 addition & 1 deletion ktrain/vision/object_detection/core.py
@@ -12,7 +12,7 @@ class ObjectDetector(TorchBase):
def __init__(self, device=None, classification=False, threshold=0.9):
"""
```
ImageCaptioner constructor
Object detection constructor
Args:
device(str): device to use (e.g., 'cuda', 'cpu')
10 changes: 10 additions & 0 deletions tests/resources/extra_tests/testrun_ptmodels.py
@@ -301,3 +301,13 @@
result = ic.caption(ifiles)
print(time.time() - start)
print(result)


# sentiment-analysis
from ktrain.text.sentiment import SentimentAnalyzer

classifier = SentimentAnalyzer()
start = time.time()
result = classifier.predict("I got a promotion today.")
print(time.time() - start)
print(result)
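
Note on the test run: the script only prints the result. A minimal sketch of a shape check that could accompany it is below. `is_valid_prediction` is a hypothetical helper, not part of ktrain or its test suite, and the sample value is the output quoted in the README hunk of this commit.

```python
VALID_LABELS = {"NEGATIVE", "NEUTRAL", "POSITIVE"}


def is_valid_prediction(result) -> bool:
    """Hypothetical check: non-empty dict of known labels with scores in [0, 1]."""
    if not isinstance(result, dict) or not result:
        return False
    return all(
        label in VALID_LABELS and 0.0 <= score <= 1.0
        for label, score in result.items()
    )


# Sample output quoted in the README example for 'I got a promotion today.'
sample = {"POSITIVE": 0.9021117091178894}
print(is_valid_prediction(sample))             # True
print(is_valid_prediction({"LABEL_2": 0.9}))   # False: unmapped label
```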
