<p align="center" draggable="false"><img src="https://user-images.githubusercontent.com/37101144/161836199-fdb0219d-0361-4988-bf26-48b0fad160a3.png"
     width="200px"
     height="auto"/>
</p>

# Reddit and HuggingFace Starter Kit

## Part I: [Reddit API](https://www.reddit.com/dev/api/)
The first part of this exercise is to instantiate a Reddit API object using the Python Reddit API Wrapper, [PRAW](https://praw.readthedocs.io/en/stable/). PRAW is a Python library that provides a simple interface to the Reddit API.

### Your Task
You will first need to instantiate a [Reddit instance](https://praw.readthedocs.io/en/stable/code_overview/reddit_instance.html).
Hint: you only need to use `client_id`, `client_secret`, and `user_agent`

#### Make sure everyone in the group does this part! 

Follow the guide below on how to get your `client_id` and `client_secret`.

#### Follow these steps:
1. Pull the `FourthBrain/ML03` repo locally so you can start development.
2. Open `reddit_and_huggingface.ipynb` and install the necessary packages for this lesson by running:

    ```
    cd code_student/Week_2
    conda activate {your_virtual_environment_name}
    pip install transformers praw torch torchvision torchaudio
    ```
    
3. Obtain your `client_id` and `client_secret`

* Make a Reddit account
* Follow the steps in the screenshot below, which are the first steps of this [guide](https://towardsdatascience.com/how-to-use-the-reddit-api-in-python-5e05ddfd1e5c).

![instructions to set up reddit api](../../images/reddit_get_access.JPG)

* Create a `secrets.py` file and include the following:

    ```
    REDDIT_API_CLIENT_ID = ""
    REDDIT_API_CLIENT_SECRET = ""
    REDDIT_API_USER_AGENT = ""  # can be any string, for example "teslabot"
    ```
    Get it? [Teslabot :)](https://www.tesla.com/AI)
    

* Put `secrets.py` in `Week_2` so you can easily import it

4. Complete the code in the `# YOUR CODE HERE` space below so that it creates a Reddit instance for interacting with the Reddit API. Note that the `subreddit` object for the 'r/TSLA' subreddit has already been created for you.

In [1]:
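import praw
from transformers import pipeline
import secrets  # expects the local secrets.py created above
from requests import Session

# Reuse a single HTTP session for all requests made by PRAW
session = Session()

# Create the Reddit instance from the credentials stored in secrets.py
reddit = praw.Reddit(
    client_id=secrets.REDDIT_API_CLIENT_ID,
    client_secret=secrets.REDDIT_API_CLIENT_SECRET,
    requestor_kwargs={"session": session},  # pass the shared Session to PRAW
    user_agent=secrets.REDDIT_API_USER_AGENT,
    username="MLP_Tesla",
)

# Subreddit object for r/TSLA
subreddit = reddit.subreddit('TSLA')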

In [4]:
for submission in reddit.subreddit("TSLA").hot(limit=10):
    print(submission.title)

TSLA PT $1400
Behind GM, Ford’s aggressive new electric vehicle strategy is old-time financing: Cash
Longtime Tesla bull Ron Baron plans to hold the stock at least another 8 years
Tesla to open Texas factory critical to growth ambitions
$TSLA surged down 2 days in a row, for all option traders, do you have Calls or Puts at this very moment?
Direct registering shares
Surge 🚀
LFG TSLA!
Musk could do, “almost anything”
Tesla reports Q1 results - Full Coverage
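If the cell above fails to authenticate, one quick thing to rule out is a credential that was left blank in `secrets.py`. The sketch below checks the three setting names used in this guide. The helper `missing_credentials` and the sample dictionary are hypothetical, not part of PRAW or of the notebook.

```python
# Setting names taken from the secrets.py template in this guide
REQUIRED_KEYS = (
    "REDDIT_API_CLIENT_ID",
    "REDDIT_API_CLIENT_SECRET",
    "REDDIT_API_USER_AGENT",
)


def missing_credentials(settings):
    """Return the names of required settings that are absent or blank.

    Hypothetical helper for illustration; `settings` is any dict-like mapping.
    """
    return [
        key
        for key in REQUIRED_KEYS
        if not str(settings.get(key) or "").strip()
    ]


# Made-up values: the client secret was left as an empty string
example_settings = {
    "REDDIT_API_CLIENT_ID": "abc123",
    "REDDIT_API_CLIENT_SECRET": "",
    "REDDIT_API_USER_AGENT": "teslabot",
}
print(missing_credentials(example_settings))
```

In the notebook, the same check could be run on the imported module with `missing_credentials(vars(secrets))`.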


## Part II:  [r/TSLA Subreddit](https://www.reddit.com/r/TSLA/)
The second part of this exercise is to figure out how the following code parses comments using the r/TSLA `subreddit` instance.

### Your Task
1. Work with your group to comment each line of the following code so that you describe what each piece is doing.
2. Create one comment at the top of the code that describes what the larger for loop is iterating over.  
3. (Optional) How many comments will I get from this?

A few resources that might help!
* How do I find the top 10 posts of all time from your favorite subreddit(s)? (hint: look at ["Obtain Submission Instances from a Subreddit"](https://praw.readthedocs.io/en/stable/getting_started/quick_start.html))
* How do I parse comments from the post? (hint: look at ["Obtain Comment Instances"](https://praw.readthedocs.io/en/stable/getting_started/quick_start.html))

In [4]:
from praw.models import MoreComments

top_comments = []
# Loop over the 10 top-ranked submissions in r/TSLA (the default time filter is all time)
for submission in subreddit.top(limit=10):
    # Loop over the top-level comments of this submission
    for top_level_comment in submission.comments:
        # Skip MoreComments placeholders, which stand in for comments that have not been loaded
        if isinstance(top_level_comment, MoreComments):
            continue

        # Store the text of the comment
        top_comments.append(top_level_comment.body)
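For the optional question about how many comments this yields: the count is not fixed. It is the number of top-level comments loaded across the 10 submissions, minus any `MoreComments` placeholders that are skipped. The sketch below reproduces the filtering logic offline with stand-in classes. `FakeComment`, `FakeMoreComments`, and `collect_bodies` are hypothetical names used only for illustration, not PRAW objects.

```python
from dataclasses import dataclass


@dataclass
class FakeComment:
    """Stand-in for a loaded comment; only the `body` attribute is modeled."""
    body: str


class FakeMoreComments:
    """Stand-in for a placeholder that represents comments not yet loaded."""


def collect_bodies(comments, placeholder_type):
    """Collect the body of every item that is not a placeholder."""
    bodies = []
    for comment in comments:
        if isinstance(comment, placeholder_type):
            continue  # placeholders have no body to read
        bodies.append(comment.body)
    return bodies


# Made-up comment list: two loaded comments and one placeholder
sample = [FakeComment("To the moon"), FakeMoreComments(), FakeComment("[deleted]")]
bodies = collect_bodies(sample, FakeMoreComments)
print(len(bodies), bodies)
```

Note that a comment whose text is `[deleted]` still counts, because it is a loaded comment rather than a placeholder.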

In [20]:
top_comments

['ho lee fuk \n\nyou got anymore insider information? 👀👀',
 "What will happen if you post that GME it's the new buy target from them? 🤣",
 'When are you all buying $DOGE, and how much will you all buy?',
 'Papa Musk?? 😘😘😘',
 'I really don’t understand what Musk is trying to do. It seems he is trying to legitimize BTC and create a sustainable ecosystem for it. But I question whether Tesla shareholders are going to be happy with such an unplanned use of invested capital. Musk is not the majority of Tesla, and big shareholders are very very picky about where their portion of $1.5bm goes to!',
 "lmk when they start loading up on Doge and I'm in",
 '[deleted]',
 'When is DOGE flying',
 'Are they gonna fire you lol',
 "You're a fucking legend",
 'Give this man a raise! (In BTC)',
 'Do you have twitter or instagram?',
 "Could you point me in the right direction on to how to code one of this bots myself. I'm a developer and have an extensive trade background. I've been interested in trading al

## Part III:  [HuggingFace](https://huggingface.co/docs/transformers/quicktour)
The third part of this exercise is to analyze the sentiment of each comment scraped from `r/TSLA`, using a pre-trained HuggingFace model to make the inference.

### Your Task
1. Implement the [Sentiment Analysis](https://huggingface.co/docs/transformers/quicktour) Model in the `# YOUR CODE HERE` section. 
2. (Optional) What is the net sentiment of the entire list of comments?

In [24]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
results = classifier(top_comments)
for result in results:
    print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
    


No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)


label: NEGATIVE, with score: 0.9937
label: NEGATIVE, with score: 0.9993
label: NEGATIVE, with score: 0.996
label: NEGATIVE, with score: 0.9925
label: NEGATIVE, with score: 0.9973
label: POSITIVE, with score: 0.98
label: NEGATIVE, with score: 0.9993
label: POSITIVE, with score: 0.6894
label: POSITIVE, with score: 0.9487
label: POSITIVE, with score: 0.998
label: POSITIVE, with score: 0.9963
label: NEGATIVE, with score: 0.9896
label: POSITIVE, with score: 0.5503
label: POSITIVE, with score: 0.974
label: NEGATIVE, with score: 0.9953
label: NEGATIVE, with score: 0.9689
label: POSITIVE, with score: 0.9997
label: NEGATIVE, with score: 0.9992
label: NEGATIVE, with score: 0.9666
label: POSITIVE, with score: 0.999
label: POSITIVE, with score: 0.9998
label: NEGATIVE, with score: 0.9994
label: POSITIVE, with score: 0.9977
label: NEGATIVE, with score: 0.9973
label: POSITIVE, with score: 0.9961
label: POSITIVE, with score: 0.9734
label: POSITIVE, with score: 0.9987
label: NEGATIVE, with score: 0.995
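For the optional question on net sentiment, there is no single agreed definition. One possible approach, assumed here, is to count labels and to average the scores after giving NEGATIVE scores a minus sign. The function name `net_sentiment` and the sample data are hypothetical. The data only mimics the list-of-dicts shape printed above.

```python
from collections import Counter


def net_sentiment(results):
    """Average of signed scores: +score for POSITIVE, -score for any other label.

    Assumed definition for illustration; returns 0.0 for an empty list.
    """
    if not results:
        return 0.0
    signed = [
        r["score"] if r["label"] == "POSITIVE" else -r["score"]
        for r in results
    ]
    return sum(signed) / len(signed)


# Made-up results with the same keys as the classifier output
sample_results = [
    {"label": "POSITIVE", "score": 0.9},
    {"label": "NEGATIVE", "score": 0.6},
    {"label": "POSITIVE", "score": 0.3},
]

# Net label count: positive labels minus negative labels
label_counts = Counter(r["label"] for r in sample_results)
print(label_counts["POSITIVE"] - label_counts["NEGATIVE"])

# Signed average score, between -1 and 1
print(round(net_sentiment(sample_results), 4))
```

Passing the real `results` list to `net_sentiment` would apply the same calculation to the scraped comments.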

In [33]:
import pandas as pd

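# Build a DataFrame with one row per comment (columns: label, score)
comments_score_df = pd.DataFrame(results)
# Median confidence score per predicted label; select the numeric column before taking the median
median_neg_score = comments_score_df.loc[comments_score_df.label == 'NEGATIVE', 'score'].median()
median_pos_score = comments_score_df.loc[comments_score_df.label == 'POSITIVE', 'score'].median()
print('median_neg_score:', median_neg_score)
print('median_pos_score:', median_pos_score)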

median_neg_score: 0.9956286549568176
median_pos_score: 0.9886464476585388


In [13]:
import random
def get_random_comment(conversations):
    comment = random.choice(conversations)
    return comment

# Run sentiment analysis
sentiment_query_sentence = get_random_comment(top_comments)  # pick a random comment from the scraped list
sentiment = classifier(sentiment_query_sentence)  # classify that single comment
print(f"Sentiment test: {sentiment_query_sentence} === {sentiment}")

Sentiment test: Holy shit! I went from up +32k to down 3k to back up 20k! Never sold, just kept adding on the way down! === [{'label': 'NEGATIVE', 'score': 0.9991133809089661}]
