<a href="https://colab.research.google.com/github/TinyZhen/326project/blob/master/Assignment_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **`Exploring different pretrained models on Hugging Face`**

Objective: The primary goal of this assignment is to familiarize students with the Python programming environment and the use of pretrained models on Hugging Face. This foundational knowledge will be crucial for training and fine-tuning our own models in future assignments.

1. Setup and Requirements Installation


In [None]:
!pip install transformers
!pip install datasets

2. Sentiment Analysis with a Pretrained Model:

We will start with a sentiment analysis task using a pretrained model from Hugging Face. Access the model via this link:
https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest


## Exercise 1:

Use the following Python code to perform sentiment analysis. Your task is to modify the `text` variable with different prompts and observe how the model's sentiment predictions change. Note that the input is preprocessed before it reaches the model: the `preprocess` function replaces user mentions and links with the placeholders `@user` and `http`.

In [None]:
# Set up dependencies and load the model
# (only the PyTorch model class is imported; the TensorFlow class was unused)
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer, AutoConfig
import numpy as np
from scipy.special import softmax

# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)

MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
config = AutoConfig.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
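
To see what the placeholder step does before any text reaches the tokenizer, here is a minimal sketch that repeats the `preprocess` function from the cell above and runs it on a made-up tweet. The sample text and the `sample` and `cleaned` variables are illustrative assumptions, not part of the assignment.

```python
# Copy of preprocess() from the setup cell, repeated so this snippet runs on its own
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        # Any mention longer than a bare "@" becomes the placeholder "@user"
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        # Any token starting with "http" becomes the placeholder "http"
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)

# Hypothetical tweet, made up for illustration
sample = "@alice loved the talk, slides at https://example.com/slides"
cleaned = preprocess(sample)
print(cleaned)
```

The mention and the link are each replaced by a fixed token, and every other word passes through unchanged. A lone `@` is left alone because of the `len(t) > 1` check.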

# Example running case

In [None]:
# Customize your input query
text = "I'm so happy!"

# Preprocess sentence before passing to the model
text = preprocess(text)
encoded_input = tokenizer(text, return_tensors='pt')
# Pass the input to the model and get the raw output
output = model(**encoded_input)
scores = output[0][0].detach().numpy()
scores = softmax(scores)

# Print labels and scores
ranking = np.argsort(scores)
ranking = ranking[::-1]
for i in range(scores.shape[0]):
    l = config.id2label[ranking[i]]
    s = scores[ranking[i]]
    print(f"{i+1}) {l} {np.round(float(s), 4)}")
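
The scoring step in the cell above turns the model's raw outputs (logits) into probabilities with a softmax and then lists the labels from most to least likely. The sketch below reproduces that logic with the standard library only, so it runs without loading a model. The logits and the `id2label` mapping here are made-up values for illustration, and `rank_labels` is a hypothetical helper name.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rank_labels(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in order]

# Assumed values for illustration only, not real model output
id2label = {0: "negative", 1: "neutral", 2: "positive"}
logits = [-1.2, 0.3, 2.5]

ranked = rank_labels(logits, id2label)
for rank, (label, prob) in enumerate(ranked, start=1):
    print(f"{rank}) {label} {prob:.4f}")
```

Because softmax preserves the order of the logits, the label with the largest logit always comes first; the softmax only rescales the scores so they can be read as probabilities.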

# Integrate the sample code into a single function

In [None]:
def classifySentence(text):
    # Preprocess sentence before passing to the model
    text = preprocess(text)
    encoded_input = tokenizer(text, return_tensors='pt')
    # Pass the input to the model and get the raw output
    output = model(**encoded_input)
    scores = output[0][0].detach().numpy()
    scores = softmax(scores)

    # Print labels and scores
    ranking = np.argsort(scores)
    ranking = ranking[::-1]
    for i in range(scores.shape[0]):
        l = config.id2label[ranking[i]]
        s = scores[ranking[i]]
        print(f"{i+1}) {l} {np.round(float(s), 4)}")

In [None]:
classifySentence('Today is a good day')

In [None]:
classifySentence('I am learning AI today.')

In [None]:
classifySentence('I have a headache.')

## Exercise 2:

 * Select another sentiment analysis model from Hugging Face.

 * Encapsulate the prediction task into a single function, like the example we provided in Exercise 1.

 * Compare the performance of the model with the one in Exercise 1. Document your findings.

List of text classification pretrained models:

https://huggingface.co/models?pipeline_tag=text-classification&sort=trending
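
One possible way to organize the comparison is to wrap each model in a function that takes a string and returns its top label, then run both functions on the same sentences and count how often they agree. The sketch below assumes that interface. `compare_models`, `model_a`, and `model_b` are hypothetical names, and the two stand-in classifiers are simple keyword rules used only so the snippet runs without downloading anything; in the notebook you would replace them with wrappers around real Hugging Face models.

```python
def compare_models(classify_a, classify_b, sentences):
    """Run two classifiers on the same sentences and collect their labels."""
    rows = []
    for sentence in sentences:
        # Lowercase so that e.g. "POSITIVE" and "positive" count as the same label
        label_a = classify_a(sentence).lower()
        label_b = classify_b(sentence).lower()
        rows.append((sentence, label_a, label_b, label_a == label_b))
    # Fraction of sentences on which both classifiers gave the same label
    agreement = sum(row[3] for row in rows) / len(rows) if rows else 0.0
    return rows, agreement

# Stand-in classifiers (hypothetical keyword rules, NOT real models)
def model_a(text):
    return "positive" if "good" in text.lower() else "neutral"

def model_b(text):
    lowered = text.lower()
    if "good" in lowered:
        return "POSITIVE"
    if "headache" in lowered:
        return "NEGATIVE"
    return "NEUTRAL"

sentences = ["Today is a good day", "I am learning AI today.", "I have a headache."]
rows, agreement = compare_models(model_a, model_b, sentences)
for sentence, a, b, same in rows:
    print(f"{sentence!r}: A={a}, B={b}, agree={same}")
print(f"Agreement: {agreement:.2f}")
```

Different models may use different label names, so the lowercase step here is only a starting point; a real comparison may need an explicit mapping between the two label sets. The sentences on which the models disagree are usually the most useful ones to discuss in your findings.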


In [None]:
# TODO, write your code here

## Exercise 3:

 * Use the pretrained ResNet-50 model for image classification. You can access the model through this link:
 https://huggingface.co/microsoft/resnet-50
 * Encapsulate the prediction task into a single function, like the example we provided in Exercise 1.
 * Pick and upload your own images.
 * Classify and visualize 3 custom images using this model.




In [None]:
# TODO, write your code here