<a href="https://colab.research.google.com/github/salmabenhassin/CognoRise-Infotech-Projects/blob/main/sentiment_analysisNLP%26transformer'sPipeline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

***Sentiment Analysis with the Hugging Face Pipeline***

Pipelines are an easy way to use models for inference. They ***are objects*** that abstract away most of the library's complex code and offer a simple API dedicated to specific tasks.

Here are some commonly used pipelines in the Transformers library:

- `sentiment-analysis` - Sentiment classification (an alias of `text-classification`).
- `text-generation` - Generate text from a prompt.
- `question-answering` - Answer questions based on a given context.
- `translation` - Translate text from one language to another.
- `summarization` - Summarize long texts into shorter forms.
- `ner` - Named Entity Recognition (NER) to detect entities such as names and organizations (an alias of `token-classification`).
- `text-classification` - General-purpose text classification.
- `fill-mask` - Fill in missing parts of a sentence (masked language modeling).
- `zero-shot-classification` - Classify text against candidate labels supplied at inference time, without task-specific labeled training data.
- `conversational` - Build a chatbot-like conversational system (removed in recent Transformers releases).
- `feature-extraction` - Extract features from text for use as embeddings.
- `token-classification` - Classify each token in a sentence (e.g., part-of-speech tagging).

These pipelines are initialized with the `pipeline` function from Hugging Face's Transformers library, which takes the task name as its first argument.
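
As a minimal sketch of such an initialization, the helper below passes an explicit model and revision to `pipeline` instead of relying on the task default. The model id and revision are the defaults that Transformers reports for `sentiment-analysis` in this notebook's output; the helper name `load_pinned_sentiment_pipeline` is hypothetical, and the sketch assumes the short revision hash resolves on the Hub as it does for the default.

```python
# Model id and revision copied from the "No model was supplied" log message
MODEL_ID = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
REVISION = "af0f99b"


def load_pinned_sentiment_pipeline():
    """Build a sentiment-analysis pipeline with an explicit model and revision."""
    # Imported lazily so that defining this helper does not load the library
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=MODEL_ID, revision=REVISION)


# Usage (downloads the model on the first call):
# nlp = load_pinned_sentiment_pipeline()
# print(nlp("I love the new design of the website!"))
```

Pinning both values keeps results reproducible if the task's default model changes in a later release.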

In [None]:
!pip install transformers torch




In [None]:
from transformers import pipeline

# Load a sentiment-analysis pipeline from Hugging Face
nlp = pipeline("sentiment-analysis")

# Example input text
texts = [
    "I love the new design of the website! It's fantastic.",
    "The product broke after one use, I'm so disappointed.",
    "It's okay, but I've seen better."
]

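def sentiment_analysis(texts):
    """Print the predicted label and confidence score for each input text."""
    results = nlp(texts)
    # The pipeline returns one {'label', 'score'} dict per input, in input order
    for text, result in zip(texts, results):
        print(f"Text: {text}")
        print(f"Sentiment: {result['label']}, Confidence: {result['score']:.4f}\n")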

# Analyze sentiment
sentiment_analysis(texts)


No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.




Text: I love the new design of the website! It's fantastic.
Sentiment: POSITIVE, Confidence: 0.9999

Text: The product broke after one use, I'm so disappointed.
Sentiment: NEGATIVE, Confidence: 0.9998

Text: It's okay, but I've seen better.
Sentiment: POSITIVE, Confidence: 0.9863
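
The default model is fine-tuned on SST-2, a binary task, so it can only answer POSITIVE or NEGATIVE; that is why the lukewarm third sentence is labeled POSITIVE. One possible workaround, sketched here under loud assumptions, is to relabel predictions whose score falls below a cutoff as NEUTRAL. The function name `with_neutral_band` and the 0.99 threshold are hypothetical and chosen only to illustrate the idea on the three outputs above; the threshold is not calibrated.

```python
def with_neutral_band(result, threshold=0.99):
    """Relabel a binary prediction as NEUTRAL when its score is below `threshold`."""
    if result["score"] < threshold:
        return "NEUTRAL"
    return result["label"]


# Labels and scores copied from the pipeline output above
results = [
    {"label": "POSITIVE", "score": 0.9999},
    {"label": "NEGATIVE", "score": 0.9998},
    {"label": "POSITIVE", "score": 0.9863},
]

labels = [with_neutral_band(r) for r in results]
print(labels)  # ['POSITIVE', 'NEGATIVE', 'NEUTRAL']
```

Softmax scores from a binary classifier are often overconfident, so a cutoff like this would need tuning on labeled data before being relied on.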



***Sentiment Analysis with Chunking, POS Tagging, and NER***

In [None]:
!pip install transformers torch spacy
!python -m spacy download en_core_web_sm


Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


In [None]:
import spacy

# Load the small English spaCy model used for chunking, POS tagging, and NER
nlp_spacy = spacy.load("en_core_web_sm")

# Example sentiment lexicon for scoring (can be expanded)
positive_words = ['love', 'fantastic', 'great', 'excellent', 'happy']
negative_words = ['disappointed', 'bad', 'broke', 'sad', 'terrible']

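# Custom sentiment analysis based on chunking, POS tagging, and NER.
# Note: the three passes overlap, so a lexicon word that is an adjective
# inside a noun chunk is counted in both the chunking and the POS pass.
def custom_sentiment_analysis(text):
    doc = nlp_spacy(text)

    # Initialize sentiment score
    sentiment_score = 0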

    # Chunking: Analyze noun phrases
    print("\nChunking (noun phrases):")
    for chunk in doc.noun_chunks:
        print(f"  - {chunk.text}")
        sentiment_score += analyze_sentiment_in_chunk(chunk.text)

    # POS Tagging: Analyze adjectives and verbs
    print("\nPOS tagging (adjectives and verbs):")
    for token in doc:
        if token.pos_ in ['ADJ', 'VERB']:  # Focus on adjectives and verbs
            print(f"  - {token.text}: {token.pos_}")
            sentiment_score += analyze_sentiment_in_token(token.text)

    # NER: Analyze named entities
    print("\nNamed Entities (NER):")
    for ent in doc.ents:
        print(f"  - {ent.text}: {ent.label_}")
        sentiment_score += analyze_sentiment_in_chunk(ent.text)

    # Return final sentiment based on the score
    if sentiment_score > 0:
        return "POSITIVE"
    elif sentiment_score < 0:
        return "NEGATIVE"
    else:
        return "NEUTRAL"

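def analyze_sentiment_in_chunk(chunk_text):
    # Sum lexicon hits over the words of a multi-word span (noun chunk or entity)
    score = 0
    for word in chunk_text.split():
        # Strip surrounding punctuation so that e.g. "great," still matches
        word = word.strip(".,!?;:\"'()").lower()
        if word in positive_words:
            score += 1
        elif word in negative_words:
            score -= 1
    return score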

def analyze_sentiment_in_token(token_text):
    # This function gives the sentiment score based on individual tokens
    if token_text.lower() in positive_words:
        return 1
    elif token_text.lower() in negative_words:
        return -1
    return 0

# Test sentences
texts = [
    "I love the new design of the website! It's fantastic.",
    "The product broke after one use, I'm so disappointed.",
    "It's okay, but I've seen better."
]

# Analyze sentiment with custom chunking, POS tagging, and NER influence
for text in texts:
    print(f"\nText: {text}")
    sentiment = custom_sentiment_analysis(text)
    print(f"Custom Sentiment: {sentiment}")




Text: I love the new design of the website! It's fantastic.

Chunking (noun phrases):
  - I
  - the new design
  - the website
  - It

POS tagging (adjectives and verbs):
  - love: VERB
  - new: ADJ
  - fantastic: ADJ

Named Entities (NER):
Custom Sentiment: POSITIVE

Text: The product broke after one use, I'm so disappointed.

Chunking (noun phrases):
  - The product
  - one use
  - I

POS tagging (adjectives and verbs):
  - broke: VERB
  - disappointed: ADJ

Named Entities (NER):
  - one: CARDINAL
Custom Sentiment: NEGATIVE

Text: It's okay, but I've seen better.

Chunking (noun phrases):
  - It
  - I

POS tagging (adjectives and verbs):
  - okay: ADJ
  - seen: VERB

Named Entities (NER):
Custom Sentiment: NEUTRAL
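
To see the scoring rule in isolation from spaCy, here is a minimal pure-Python sketch that applies the same example lexicon to every word of the text, counting each occurrence once. The names `lexicon_score` and `score_to_label` are hypothetical, and the regex tokenizer is a simplification that stands in for spaCy's tokenization; it is not the method used in the cell above.

```python
import re

# Same example lexicon as in the cell above, stored as sets for fast lookup
POSITIVE_WORDS = {"love", "fantastic", "great", "excellent", "happy"}
NEGATIVE_WORDS = {"disappointed", "bad", "broke", "sad", "terrible"}


def lexicon_score(text):
    """Return (# positive words) - (# negative words) found in `text`."""
    # Lowercase and keep runs of letters/apostrophes, which drops punctuation
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE_WORDS) - (w in NEGATIVE_WORDS) for w in words)


def score_to_label(score):
    """Map a signed score to the three labels used by the custom analysis."""
    if score > 0:
        return "POSITIVE"
    if score < 0:
        return "NEGATIVE"
    return "NEUTRAL"


texts = [
    "I love the new design of the website! It's fantastic.",
    "The product broke after one use, I'm so disappointed.",
    "It's okay, but I've seen better.",
]

for text in texts:
    print(f"{score_to_label(lexicon_score(text))}: {text}")
```

On these three sentences the sketch yields the same labels as the output above (POSITIVE, NEGATIVE, NEUTRAL), which suggests that here the lexicon, rather than the chunking or NER passes, decides the result.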
