# Daily Social Media Summarizer and Research Agent

### Problem Statement:

The average person uses [6.83 social media platforms each month](https://datareportal.com/social-media-users), and the number of people who regularly get their news from social media sites [continues to grow](https://www.pewresearch.org/journalism/fact-sheet/social-media-and-news-fact-sheet/).

How do we consolidate, summarize, and filter social media news in a single place so that users can quickly understand what is happening and ask follow-up questions, bringing in additional authoritative data sources where needed?

Additionally, is it possible to filter search results based on sentiment? For example, can a user filter out negative news stories if they want to see only positive news, and can we fine-tune the model to the user's preferences?

### User Persona:

Internet users of all backgrounds who get their news from social media services.

### Proposed Solution

We can build an application that leverages LLM technology and:

1. Pulls news from subscribed social media sites on a daily basis
2. Classifies the returned news stories based on sentiment
3. Creates embeddings for the news stories and stores them in a vector database with the classification metadata
4. Creates a summary for the user of news stories from the past 24 hours, filtering out negative sentiment
5. Allows the user to query for more information on the news stories, while providing grounding when necessary

### In this notebook

I demonstrate this solution in this notebook by:

1. Using the Reddit API to pull submissions from the past day
2. Categorizing the returned submissions based on sentiment
3. Creating embeddings for the submissions and storing them in a vector database with their metadata
4. Searching the vector database for all submissions from the past day with positive sentiment and returning a summary of those posts
5. Allowing the user to ask further questions, answered either from the social media posts via RAG or with grounding

### Concepts Utilized From the Course Include:

1. Prompting - Role prompting, System prompting, one-shot, and few-shot prompting
2. Structured Output
3. Fine Tuning
4. Embeddings
5. RAG
6. Grounding

### Differences / Expansion from the course material:

1. I used an LLM to create synthetic training and test data to fine-tune a model.
2. I classify submissions and add this metadata when inserting the document embeddings into the vector database
3. I added extensive metadata for the document embeddings stored in the vector database
4. I perform queries on the vector database using constraints based on the metadata

### FAQ:

#### Why is RAG important in this use case?
I want to give the user the ability to summarize and search social media submissions across all of their social media networks. This enables them to subscribe to content specific to their interests and see this content in one place.

# Set Up

1. Remove Conflicting Dependencies
2. Install Google Gen AI, Chroma DB, and dependencies
3. Install Reddit SDK

In [1]:
!pip uninstall -qqy jupyterlab google-cloud-automl # Remove conflicting packages

!pip install -U -q "google-genai==1.7.0" "chromadb==0.6.3" "protobuf==3.20.3" "google-api-core==2.16.2" "praw==7.8.1" 


## Check Set-Up

Check that we are using the correct version of the Google Gen AI SDK to ensure compatibility and that the environment is set up correctly.

In [2]:
from google import genai
from google.genai import types

from IPython.display import Markdown

genai.__version__

'1.7.0'

### Set up the Google API key

To run the following cell, your API key must be stored in a [Kaggle secret](https://www.kaggle.com/discussions/product-feedback/114053) named `GOOGLE_API_KEY`.

If you don't already have an API key, you can grab one from [AI Studio](https://aistudio.google.com/app/apikey). You can find [detailed instructions in the docs](https://ai.google.dev/gemini-api/docs/api-key).

To make the key available through Kaggle secrets, choose `Secrets` from the `Add-ons` menu and follow the instructions to add your key or enable it for this notebook.

In [3]:
from kaggle_secrets import UserSecretsClient

GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")

# Fine Tuning

I started by fine-tuning a model to classify the sentiment of a social media post. To do this, I performed the following tasks:

1. Find a Gemini model that allows training
2. Create a synthetic dataset of generated posts and sentiments to train and test the model
3. Prepare and sample the data
4. Evaluate baseline performance of the default model
5. Fine-tune the model
6. Evaluate performance of the fine-tuned model

However, across multiple test runs the baseline performance of the default LLM was at or near 100%, while the fine-tuned model scored between 35% and 75%. The default model was better! I have documented this experiment in this [Kaggle notebook](https://www.kaggle.com/code/brbeck/fine-tuning-that-made-things-worse).

For this exercise we will use the default LLM. In a production version of this application, we could use real-world submissions and users' own classifications to fine-tune a model that better aligns with each user's sense of sentiment.
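
The baseline and fine-tuned comparison above comes down to a simple accuracy measurement: the share of model predictions that match the expected labels. A minimal sketch of that calculation follows. The function name `accuracy` and the toy data are illustrative assumptions, not code or results from the linked experiment.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Return the fraction of predictions that match the expected labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    # Compare case-insensitively and ignore stray whitespace in model output.
    correct = sum(
        p.strip().lower() == l.strip().lower()
        for p, l in zip(predictions, labels)
    )
    return correct / len(labels)


# Hypothetical toy data for illustration only (not results from the experiment).
toy_labels = ["positive", "negative", "neutral", "negative"]
toy_predictions = ["positive", "negative", "positive", "negative"]
print(f"accuracy: {accuracy(toy_predictions, toy_labels):.0%}")
```

Running the same function over the default model's predictions and the fine-tuned model's predictions on one shared test set would give directly comparable numbers.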

## Connect to the Google Gemini API

We will use the default model (one that also supports fine-tuning) to query Gemini for the sentiment of a submission. As noted above, the default model performed well, with success rates as high as 100% in testing.

In the query below, we constrain the output to an enum so that the model returns 'positive', 'negative', or 'neutral'.

In [4]:
from google.api_core import retry
import enum

client = genai.Client(api_key=GOOGLE_API_KEY)

class Sentiment(enum.Enum):
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"

model_config = types.GenerateContentConfig(
    temperature=0.1,
    top_p=1,
    max_output_tokens=5,
)

zero_shot_prompt = """You are an expert at classifying the sentiment of text written by people and posted on the internet as submissions.  There are 3 classifications: positive, neutral, or negative.  
Positive submissions are hopeful, confident, and cover good aspects of a situation.  Negative submissions concern war, violence, and crime.  Neutral submissions are neither positive nor negative. 
Please classify the sentiment of each of the following Reddit submissions as positive, neutral or negative."""

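def eval_submission(submission):
    """Classify one submission as 'positive', 'neutral', or 'negative'."""
    # Separate the instructions from the submission text with a blank line.
    prompt = f"{zero_shot_prompt}\n\nSubmission: {submission}"

    response = client.models.generate_content(
        #model='gemini-2.0-flash',
        model='gemini-1.5-flash-001',
        # Constrain the output to one of the Sentiment enum values.
        config=types.GenerateContentConfig(
            response_mime_type="text/x.enum",
            response_schema=Sentiment
        ),
        contents=prompt)

    return response.text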
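
The classifier returns the label as plain text, so downstream code compares strings. A small sketch of converting that text back into the `Sentiment` enum, with a fallback for empty or unexpected output, is shown below. The helper name `parse_sentiment` and the choice of `NEUTRAL` as the fallback are assumptions made for illustration.

```python
import enum
from typing import Optional


class Sentiment(enum.Enum):
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"


def parse_sentiment(raw: Optional[str],
                    default: Sentiment = Sentiment.NEUTRAL) -> Sentiment:
    """Map raw model text to a Sentiment member, falling back to `default`."""
    if raw is None:
        # No text came back (for example, a blocked or empty response).
        return default
    try:
        return Sentiment(raw.strip().lower())
    except ValueError:
        # The text was not one of the three expected labels.
        return default


print(parse_sentiment(" Positive\n"))
```

Storing `parse_sentiment(...).value` in the metadata would keep the stored labels limited to the three known strings.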
    

# Social Media Post RAG

Now that we have a sentiment classifier, we can use it on social media posts to classify their sentiment and filter posts for the user.

In addition, we can create a RAG architecture in which we create embeddings for the posts and store them, along with post metadata, in a vector database.

This will allow us not only to summarize the social media posts as intended, but also to let users search over their post history and filter posts by sentiment (they may not want to see sad news), date posted, or other metadata.

To do this, we will:

1. Create a Reddit API Caller
2. Retrieve social media posts and post metadata from Reddit
3. Use the sentiment classifier defined above to classify the sentiment and update the metadata
4. Create the embeddings for the posts and add them and their metadata to the vector database
5. Search the database for all postings in the previous 24 hours with a positive sentiment
6. Use Gemini to create a summary of the posts meeting that criteria

Additionally, in the future, we can allow users to classify the sentiment of their social media posts and use this information to fine-tune a model for them.

### Set Up Reddit API Caller

To create a Reddit API caller account, you must create a new application on Reddit [here](https://www.reddit.com/prefs/apps/).

To make the keys available through Kaggle secrets, choose `Secrets` from the `Add-ons` menu and follow the instructions to add your key or enable it for this notebook.

In [5]:
REDDIT_CLIENT_ID = UserSecretsClient().get_secret("REDDIT_CLIENT_ID")
REDDIT_CLIENT_SECRET = UserSecretsClient().get_secret("REDDIT_CLIENT_SECRET")
REDDIT_USERNAME = UserSecretsClient().get_secret("REDDIT_USERNAME")
REDDIT_PASSWORD = UserSecretsClient().get_secret("REDDIT_PASSWORD")

if not all([REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDDIT_USERNAME, REDDIT_PASSWORD]):
    print("Please set the REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDDIT_USERNAME, and REDDIT_PASSWORD Kaggle secrets")


### Pull the Submissions

For the subreddits a user has subscribed to, we can pull the submissions for each subreddit.

In the example below, we limit the number of submissions pulled to accommodate the free tier of the API.

For each submission we are not only retrieving the body of the submission, but also the metadata.

In [6]:
# Import Reddit interface
import praw

from datetime import datetime
from typing import List, Dict

documents = []
metadatas = []

# The number of subreddits we will return from the API
SUBREDDIT_LIMIT = 10  # None would remove the limit
# The number of submissions we will retrieve from each subreddit
SUBMISSION_LIMIT = 2


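# Retrieve the list of subreddits the authenticated user is subscribed to
def get_user_subreddits(reddit: praw.Reddit, username: str) -> List[praw.models.Subreddit]:
    # reddit.user refers to the account used to authenticate, so `username`
    # is kept only for compatibility with the call below.
    subreddits = reddit.user.subreddits(limit=SUBREDDIT_LIMIT)

    return list(subreddits)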

# Initialize Reddit interface and retrieve auth token
reddit = praw.Reddit(
    client_id=REDDIT_CLIENT_ID,
    client_secret=REDDIT_CLIENT_SECRET,
    username=REDDIT_USERNAME,
    password=REDDIT_PASSWORD,
    user_agent='python:reddit-scraper:v1.0 (by /u/' + REDDIT_USERNAME + ')',
)

subreddits = get_user_subreddits(reddit, REDDIT_USERNAME)

for subreddit in subreddits:
    #print(f"\nFetching recent posts from r/{subreddit}...")
    
    for submission in subreddit.new(limit=SUBMISSION_LIMIT):
        document = {
            'body': submission.selftext
        }

        metadata = {
            'source': "reddit",
            'title': submission.title,
            'author': str(submission.author),
            'score': submission.score,
            'created_utc': submission.created_utc,
            'url': submission.url,
            'num_comments': submission.num_comments,
            'body': submission.selftext,
            'category': subreddit.display_name,  # human-readable subreddit name
            'sentiment': "none"
        }

        #Remove link posts for this exercise. In a more robust implementation we would follow the links, scrape the site and extract the text for embeddings and classification
        if document['body'] != '':
            documents.append(document['body'])
            metadatas.append(metadata)

print("Number of Documents " + str(len(documents)))


# TODO: add sample data here and check whether the Reddit connection is valid. If not, use the sample data.


Number of Documents 12
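
The TODO in the cell above mentions falling back to sample data when the Reddit connection is unavailable. One way this could look is sketched below. The function `load_with_fallback`, the `fetch` callable, and the placeholder records are all hypothetical names and data invented for illustration; the placeholder text is not real Reddit content.

```python
import time
from typing import Callable, Dict, List, Tuple

Records = Tuple[List[str], List[Dict]]

# Hypothetical placeholder records, invented for illustration only.
# created_utc is set to "now" so the record would pass a 24-hour filter.
SAMPLE_DOCUMENTS = ["Placeholder submission body used when Reddit is unreachable."]
SAMPLE_METADATAS = [{
    "source": "sample",
    "title": "Placeholder title",
    "created_utc": time.time(),
    "sentiment": "none",
}]


def load_with_fallback(fetch: Callable[[], Records]) -> Records:
    """Call fetch(); use the sample records if it fails or returns nothing."""
    try:
        docs, metas = fetch()
    except Exception as exc:  # e.g. authentication or network errors
        print(f"Fetch failed ({exc!r}); using sample data")
        return SAMPLE_DOCUMENTS, SAMPLE_METADATAS
    if not docs:
        print("No submissions returned; using sample data")
        return SAMPLE_DOCUMENTS, SAMPLE_METADATAS
    return docs, metas
```

In the notebook, the Reddit loop would be wrapped in a function that returns `(documents, metadatas)` and passed in as `fetch`.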


## Sentiment Classification

Here I classify the sentiment of each submission using the default LLM (the same model that could later be fine-tuned).

In [7]:
from google.api_core import retry

# Define a helper to retry when per-minute quota is reached.
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

# Wrap the classifier defined earlier so that quota errors (429) and
# transient outages (503) are retried instead of aborting the loop.
classify_submission = retry.Retry(predicate=is_retriable)(eval_submission)

# Classify each submission and record the label in its metadata.
for document, metadata in zip(documents, metadatas):
    metadata['sentiment'] = classify_submission(document)


Let's view the sentiment for the social media submissions. 

If none of the submissions have a positive sentiment, we make them all positive for the purposes of this exercise.

In [8]:
positive_submissions = 0

for document, metadata in zip(documents, metadatas):
    if (metadata['sentiment'] == 'positive'):
        positive_submissions += 1
    print(metadata['sentiment'])

if (positive_submissions == 0):
    print ("NO POSITIVE SUBMISSIONS!! MAKING THEM ALL POSITIVE FOR THIS EXERCISE")
    for document, metadata in zip(documents, metadatas):
        metadata['sentiment'] = 'positive'

neutral
neutral
negative
neutral
neutral
negative
neutral
positive
neutral
positive
neutral
neutral
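
A per-label tally can be easier to scan than one label per line. The sketch below uses `collections.Counter` for that; the helper name `sentiment_counts` and the three example records are assumptions for illustration and are not the notebook's data.

```python
from collections import Counter
from typing import Dict, Iterable


def sentiment_counts(metadatas: Iterable[Dict]) -> Counter:
    """Count how many records carry each sentiment label."""
    return Counter(m.get("sentiment", "none") for m in metadatas)


# Hypothetical records for illustration only.
example = [
    {"sentiment": "neutral"},
    {"sentiment": "positive"},
    {"sentiment": "neutral"},
]
for label, count in sentiment_counts(example).most_common():
    print(f"{label}: {count}")
```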


### Explore available models to calculate embeddings

You will be using the [`embedContent`](https://ai.google.dev/api/embeddings#method:-models.embedcontent) API method to calculate embeddings in this guide. Find a model that supports it through the [`models.list`](https://ai.google.dev/api/models#method:-models.list) endpoint. You can also find more information about the embedding models on [the models page](https://ai.google.dev/gemini-api/docs/models/gemini#text-embedding).

`text-embedding-004` is the most recent generally-available embedding model, so you will use it for this exercise, but try out the experimental `gemini-embedding-exp-03-07` model too.

In [9]:
client = genai.Client(api_key=GOOGLE_API_KEY)

for m in client.models.list():
    if "embedContent" in m.supported_actions:
        print(m.name)

models/embedding-001
models/text-embedding-004
models/gemini-embedding-exp-03-07
models/gemini-embedding-exp


## Creating the embedding database with ChromaDB

I create a [custom function](https://docs.trychroma.com/guides/embeddings#custom-embedding-functions) to generate embeddings with the Gemini API. 


In [10]:
from chromadb import Documents, EmbeddingFunction, Embeddings
from google.api_core import retry

from google.genai import types


# Define a helper to retry when per-minute quota is reached.
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})


class GeminiEmbeddingFunction(EmbeddingFunction):
    # Specify whether to generate embeddings for documents, or queries
    document_mode = True

    @retry.Retry(predicate=is_retriable)
    def __call__(self, input: Documents) -> Embeddings:
        if self.document_mode:
            embedding_task = "retrieval_document"
        else:
            embedding_task = "retrieval_query"

        response = client.models.embed_content(
            model="models/text-embedding-004",
            contents=input,
            config=types.EmbedContentConfig(
                task_type=embedding_task,
            ),
        )
        return [e.values for e in response.embeddings]
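
With only a handful of submissions, a single embedding request is enough. For larger pulls, the documents may need to be sent in smaller groups. The sketch below shows a generic batching helper; the name `batched` and the default size of 100 are assumptions, so check the current API limits before relying on a specific number.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Example: five short texts split into groups of two.
for group in batched(["a", "b", "c", "d", "e"], batch_size=2):
    print(group)
```

Each group could then be added to the collection separately, taking care that the ids stay unique across groups.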

In [11]:
import chromadb

DB_NAME = "redditposts"

embed_fn = GeminiEmbeddingFunction()
embed_fn.document_mode = True

chroma_client = chromadb.Client()
db = chroma_client.get_or_create_collection(name=DB_NAME, embedding_function=embed_fn)

db.add(documents=documents, metadatas=metadatas, ids=[str(i) for i in range(len(documents))])

In [12]:
db.count()
# You can peek at the data too.
# db.peek(1)

12

## Create a Summary

Here I create a summary of all submissions from the past 24 hours that have a positive sentiment.  I do this using Chroma metadata filtering, which lets us filter on date and sentiment.

First, we get the list of submissions from the past 24 hours with positive sentiment.

In [13]:
from datetime import timedelta

LOOKBACK_IN_DAYS = 1

# Only keep submissions created after this cutoff (now minus the lookback window).
cutoff = datetime.now() - timedelta(days=LOOKBACK_IN_DAYS)

results_within_timeframe = db.get(where={"$and": [{"created_utc": {"$gt": int(cutoff.timestamp())}}, {"sentiment": {"$eq": "positive"}}]})

all_passages = results_within_timeframe["documents"]

print("Number of Documents within the previous time period and with positive sentiment " + str(len(results_within_timeframe['documents'])))

#for i in results_within_timeframe['metadatas']:
#    print(i['created_utc'])

Number of Documents within the previous time period and with positive sentiment 2
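
The metadata filter used above can be built by a small function so that the lookback window and the sentiment label are easy to change. This is a sketch under the assumption that the metadata keys are `created_utc` and `sentiment`, as in this notebook; the function name `recent_sentiment_filter` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def recent_sentiment_filter(sentiment: str,
                            lookback_days: int = 1,
                            now: Optional[datetime] = None) -> dict:
    """Build a Chroma `where` filter for recent records with a given sentiment."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=lookback_days)
    return {
        "$and": [
            # Keep only records created after the cutoff (epoch seconds).
            {"created_utc": {"$gt": int(cutoff.timestamp())}},
            # Keep only records with the requested sentiment label.
            {"sentiment": {"$eq": sentiment}},
        ]
    }


print(recent_sentiment_filter("positive"))
```

The result can be passed as `db.get(where=recent_sentiment_filter("positive"))`. Replacing the second clause with `{"sentiment": {"$ne": "negative"}}` would instead keep both positive and neutral submissions.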


Now I create the prompt to generate the final summary from the retrieved data.

In [14]:
# This prompt is where you can specify any guidance on tone, or what topics the model should stick to, or avoid.
prompt = f"""You are a helpful and informative bot that provides a succinct summary of each document provided below.
Be sure to respond in a complete sentence, with one bullet point for each document.
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.
"""

# Add the retrieved documents to the prompt.
for passage in all_passages:
    passage_oneline = passage.replace("\n", " ")
    prompt += f"PASSAGE: {passage_oneline}\n"

print(prompt)

You are a helpful and informative bot that provides a succinct summary of each document provided below.
Be sure to respond in a complete sentence, with one bullet point for each document.
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.
PASSAGE: I messaged the mods but haven’t heard back, so please let me know if this is out of line. Many CoastFIRE calculators are either too simple or overly complex. I created one to fill the gap—a CoastFIRE calculator I wish existed. It not only determines if you’ve reached CoastFIRE but also allows scenario analysis for reduced variable expenses, part-time income (Barista FI), and paying off your mortgage early with a lump sum at retirement age. It includes notifications about being ahead of schedule and how much earlier you could retire. I've also included net worth if you were to co
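
The summary prompt and the later question-answering prompt are assembled the same way: instructions first, then an optional question, then one `PASSAGE:` line per document. A possible shared helper is sketched below. The name `build_prompt` is hypothetical, and unlike the notebook code it collapses all runs of whitespace, not only newlines.

```python
from typing import Iterable, Optional


def build_prompt(instructions: str,
                 passages: Iterable[str],
                 question: Optional[str] = None) -> str:
    """Assemble instructions, an optional question, and one line per passage."""
    lines = [instructions.strip()]
    if question:
        lines.append("")
        # Collapse whitespace so the question stays on a single line.
        lines.append("QUESTION: " + " ".join(question.split()))
    for passage in passages:
        # Collapse whitespace so each passage stays on a single line.
        lines.append("PASSAGE: " + " ".join(passage.split()))
    return "\n".join(lines) + "\n"


print(build_prompt("Summarize each passage.", ["first\npassage", "second passage"]))
```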

Finally, we pass the prompt, with retrieved data, to the model to create the social media summary.

In an application setting, we could present the user with all data filtered by sentiment, along with the ability to change a post's sentiment label for future RAG retrieval filtering and future fine-tuning.

In [15]:
answer = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt)

Markdown(answer.text)

Okay, I can certainly help you with that! Here's a summary of those passages:

*   One person created a CoastFIRE calculator (CoastFIRE is when you have enough saved that your investments will grow enough to let you retire someday, even if you stop saving!) that is designed to be easy to use, but still detailed, and allows users to do things like plan for reduced expenses or paying off a mortgage.
*   The other passage discusses a project focused on creating a stylish and responsive card layout using HTML and CSS, designed to display information about different cities in an engaging way.


## Search Across Submissions with RAG

Once a user sees the summary, they can perform a RAG search, which uses embeddings to retrieve relevant submissions and includes them in the prompt for consideration.

The code block below searches the vector database for embeddings relevant to the query.

In [16]:
# Switch to query mode when generating embeddings.
embed_fn.document_mode = False

# Search the Chroma DB using the specified query.
query = "What is the latest news today on Google Artificial Intelligence?"

result = db.query(query_texts=[query], n_results=1)
[all_passages] = result["documents"]

Markdown(all_passages[0])

Been looking at them for a while and trying to see if they are worth the investment. Seems they are built and growing despite all the negative publicity about their IPO. 
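
The passage returned above is unrelated to the query because a nearest-neighbor search always returns the closest stored item, even when nothing in the collection is actually relevant. The toy sketch below illustrates the ranking idea with cosine similarity. The 3-dimensional vectors and document ids are made up for illustration, and the distance metric Chroma actually applies depends on how the collection is configured.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_match(query_vec: Sequence[float],
              doc_vecs: Dict[str, Sequence[float]]) -> str:
    """Return the id of the stored vector most similar to the query."""
    return max(doc_vecs, key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]))


# Made-up toy vectors; real embeddings have hundreds of dimensions.
toy_docs = {"finance": [0.9, 0.1, 0.0], "web-design": [0.1, 0.9, 0.2]}
toy_query = [0.8, 0.2, 0.1]
print(top_match(toy_query, toy_docs))  # prints "finance"
```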

A prompt is then created that incorporates the passages returned from the vector database search.

In [17]:
query_oneline = query.replace("\n", " ")

# This prompt is where you can specify any guidance on tone, or what topics the model should stick to, or avoid.
prompt = f"""You are a helpful and informative bot that answers questions using text from the reference passage included below. 
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. 
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.

QUESTION: {query_oneline}
"""

# Add the retrieved documents to the prompt.
for passage in all_passages:
    passage_oneline = passage.replace("\n", " ")
    prompt += f"PASSAGE: {passage_oneline}\n"

print(prompt)

You are a helpful and informative bot that answers questions using text from the reference passage included below. 
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. 
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.

QUESTION: What is the latest news today on Google Artificial Intelligence?
PASSAGE: Been looking at them for a while and trying to see if they are worth the investment. Seems they are built and growing despite all the negative publicity about their IPO. 



I then pass that prompt to the LLM to generate a response for the user. If the passages don't contain information helpful to the user, the LLM will say so. In that case, the user can instead search with grounding and incorporate information from a Google search, as outlined in the next section.

In [18]:
answer = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt)

Markdown(answer.text)

I am sorry, but the reference text provided is not relevant to the question about Google Artificial Intelligence news. Therefore, I am unable to answer.
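
When the retrieved passages are not relevant, as in the response above, an application could decide automatically to fall back to grounding. One possible heuristic is to look at the distances that the vector search returns alongside the documents. The sketch below assumes smaller distances mean closer matches; the function name `should_use_grounding` and the threshold of 0.8 are hypothetical and would need tuning against real data.

```python
from typing import Sequence


def should_use_grounding(distances: Sequence[float],
                         max_distance: float = 0.8) -> bool:
    """Return True when no retrieved passage is close enough to the query."""
    # No results at all, or even the best match is too far away.
    return not distances or min(distances) > max_distance


print(should_use_grounding([1.2, 0.9]))   # True: nothing is close enough
print(should_use_grounding([0.3, 1.1]))   # False: one passage is close
```

With Chroma, the distances for the first query are available as `result["distances"][0]` when distances are included in the query results.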


## Search with Grounding

The user may want to discover more about a topic from a social media post, but this information may be very recent and unknown to the LLM. We can use grounding via Google search to get the latest information for the topic the user is researching.

In [19]:
config_with_search = types.GenerateContentConfig(
    tools=[types.Tool(google_search=types.GoogleSearch())],
)

def query_with_grounding():
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=query,
        config=config_with_search,
    )
    return response.candidates[0]


rc = query_with_grounding()
Markdown(rc.content.parts[0].text)

Here's the latest on Google AI as of today, April 18, 2025:

**Key Developments & Announcements**

*   **AI-Powered Workflow Automation:** Google is bringing more AI capabilities to Workspace tools (Docs, Sheets, Meet, Chat) and introducing Google Workspace Flows, a new way to automate work across apps using AI.
*   **Gemini in Workspace:** Businesses are using Gemini in Workspace for over 2 billion AI assists monthly to save time and improve results.
*   **Gemma 3:** Google launched Gemma 3, its latest open AI models, to help developers create helpful applications.
*   **DolphinGemma:** Google AI is being used to help decode dolphin communication.
*   **AI for Nature Protection:** Google launched initiatives to protect and restore nature using AI, including a startup accelerator and funding for AI-enabled solutions from Brazilian nonprofits. They also released SpeciesNet, an open-source AI model for identifying animal species from camera trap photos.
*   **Gemini 2.5 Pro:** Google released Gemini 2.5 Pro, their most intelligent AI model.
*   **Gemini Robotics:** Google released Gemini Robotics to help bring AI into the physical world.

**Research Updates**

*   **InstructPipe:** Generating Visual Blocks pipelines with human instructions and LLMs
*   **AI in Biology:** Teaching machines the language of biology for next-generation single-cell analysis.
*   **Geospatial Reasoning:** Using generative AI and multiple foundation models to unlock insights.
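
A grounded answer is more useful when the user can see where the information came from. The sketch below collects the titles and links of the web sources attached to a response candidate. It assumes the candidate exposes `grounding_metadata.grounding_chunks`, each with a `web.title` and `web.uri`, as in the Gen AI SDK response types; the helper name `grounding_sources` and the stand-in object are for illustration only.

```python
from types import SimpleNamespace
from typing import List, Tuple


def grounding_sources(candidate) -> List[Tuple[str, str]]:
    """Return (title, uri) pairs for the web sources attached to a candidate."""
    metadata = getattr(candidate, "grounding_metadata", None)
    chunks = getattr(metadata, "grounding_chunks", None) or []
    sources = []
    for chunk in chunks:
        web = getattr(chunk, "web", None)
        if web is not None:
            sources.append((web.title, web.uri))
    return sources


# Stand-in object shaped like a response candidate (not a real API response).
fake_candidate = SimpleNamespace(
    grounding_metadata=SimpleNamespace(
        grounding_chunks=[
            SimpleNamespace(web=SimpleNamespace(title="example.com",
                                                uri="https://example.com/a")),
        ]
    )
)
print(grounding_sources(fake_candidate))
```

In the notebook, calling `grounding_sources(rc)` on the candidate returned by `query_with_grounding()` would list the sources behind the answer, if any were attached.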

# Conclusion

The capabilities demonstrated above show that it is possible to build an application that:

1. Pulls news from subscribed social media sites on a daily basis
2. Classifies the returned news stories based on sentiment
3. Creates embeddings for the news stories and stores them in a vector database with the classification metadata
4. Creates a summary for the user of news stories from the past 24 hours, filtering out negative sentiment
5. Allows the user to query for more information on the news stories, while providing grounding when necessary

# Next Steps

My next goal is to create an agent that performs the actions above as an exercise in agent creation.  From there, I would be interested in creating a desktop or web application with a deployed MLOps backend to support it.


# References

These are the sources from which I drew code, examples, or information for this notebook.

### Google Course Notebooks

1. Day 1 - Prompting
2. Day 2 - Document Q&A with RAG
3. Day 3 - Building an agent with LangGraph
4. Day 4 - Fine tuning a custom model

### Additional References

1. Chroma Metadata Filtering : [link](https://docs.trychroma.com/docs/querying-collections/metadata-filtering)
2. Sentiment Analysis with Keras and LSTM : [link](https://www.kaggle.com/code/roblexnana/sentiment-analysis-with-keras-and-lstm/notebook)
3. Fine-tuning with the Gemini API : [link](https://ai.google.dev/gemini-api/docs/model-tuning)
4. Fine-tuning Tutorial : [link](https://ai.google.dev/gemini-api/docs/model-tuning/tutorial?lang=python)
5. Custom Embedding Functions : [link](https://docs.trychroma.com/docs/embeddings/embedding-functions#custom-embedding-functions)

# Additional Reading

These are sources I referenced, but did not incorporate into this notebook. I note them here as additional reference materials for people who may be interested in social media submission classification in adjacent domains.

1. Financial Sentiment Analysis: Techniques and Application : [link](https://dl.acm.org/doi/10.1145/3649451)
2. Applying Sentiment Analysis Techniques in Social Media Data About Threat of Armed Conflicts Using Two Times Series Models : [link](https://www.researchgate.net/publication/367142985_Applying_Sentiment_Analysis_Techniques_in_Social_Media_Data_About_Threat_of_Armed_Conflicts_Using_Two_Times_Series_Models)
3. Threat detection in online discussion using convolutional neural network : [link](https://www.duo.uio.no/bitstream/handle/10852/59278/5/Thesis_Stenberg.pdf)