<a href="https://colab.research.google.com/github/AlphaZero3d/NewsScrapeAndSentimentAnalyzer/blob/main/News_Sentiment_Analyzer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#News Sentiment Analyzer

Copyright (c) 2023, Alphaero3d@gmail.com
All rights reserved.

This source code is licensed under the BSD-style license found in the
LICENSE file in the root directory of this source tree.

##Install requirements

In [None]:
!pip install requests beautifulsoup4 nltk lxml
import nltk
nltk.download('vader_lexicon')

##Run Program

In [20]:
import nltk
#nltk.download('vader_lexicon')
import requests
from bs4 import BeautifulSoup
from nltk.sentiment import SentimentIntensityAnalyzer


def scrape_news(url):
  """Scrapes the top headlines from a news website."""
  response = requests.get(url)
  soup = BeautifulSoup(response.content, "html.parser")
  news_list = soup.find_all("item")
  return news_list

def analyze_sentiment(news_list):
  """Analyzes the sentiment of the top headlines from a news website."""
  analyzer = SentimentIntensityAnalyzer()
  sentiments = []
  for news in news_list:
    title_sentiment = analyzer.polarity_scores(news.title.text)

    description_sentiment = analyzer.polarity_scores(news.description.text)
    sentiments.append((title_sentiment, description_sentiment))
  return sentiments

def sum_sentiment_scores(sentiments):
  """Sums the values in the 'neg', 'neu', and 'pos' columns of the sentiment scores."""
  neg_sum = 0
  neu_sum = 0
  pos_sum = 0
  for title_sentiment, description_sentiment in sentiments:
    neg_sum += title_sentiment['neg'] + description_sentiment['neg']
    neu_sum += title_sentiment['neu'] + description_sentiment['neu']
    pos_sum += title_sentiment['pos'] + description_sentiment['pos']
  return neg_sum, neu_sum, pos_sum

#userterm = input()

if __name__ == "__main__":
  url  = "https://news.google.com/news/rss"
  #url = "https://news.google.com/rss/search?q=stocks&hl=en-US&gl=US&ceid=US%3Aen"
  #url = "https://news.google.com/rss/search?q=bonds&hl=en-US&gl=US&ceid=US%3Aen"
  #url = "https://news.google.com/rss/search?q=earnings&hl=en-US&gl=US&ceid=US%3Aen"
  #url = "https://news.google.com/rss/search?q=netflix&hl=en-US&gl=US&ceid=US%3Aen"
  #url = "https://www.npr.org/sections/news/"
  #url = "http://feeds.bbci.co.uk/news/rss.xml"
  #url = "https://www.ft.com/news-feed/rss.xml"
  #url = "https://www.nasdaq.com/feed/rssoutbound?category=Stocks/rss.xml"
  #url = "https://news.google.com/rss/search?q=when:24h+allinurl:bloomberg.com&hl=en-US&gl=US&ceid=US:en"
  #url = "https://news.google.com/rss/search?gl=US&hl=en-US&q=Tesla,+Inc.&ceid=US:en"
  #url = "https://news.google.com/rss/search?q=when:24h+allinurl:bloomberg.com"
  #url = "https://rss.nytimes.com/services/xml/rss/nyt/MostViewed.xml"
  news_list = scrape_news(url)
  sentiments = analyze_sentiment(news_list)
  neg_sum, neu_sum, pos_sum = sum_sentiment_scores(sentiments)
  print("Negative sentiment:", neg_sum)
  print("Neutral sentiment:", neu_sum)
  print("Positive sentiment:", pos_sum)
  print("Positive - Negative Difference: ",pos_sum-neg_sum)


Negative sentiment: 7.061999999999999
Neutral sentiment: 63.803
Positive sentiment: 5.131
Positive - Negative Difference:  -1.9309999999999992


##Notes:
Using google News we can change the "search" by editing the term that follows this line of code
```
search?q=
```
our new query will be:
```
search?q= NEW_TERM_HERE
```
the final string will look like this:
```
"https://news.google.com/rss/search?q=NEW_TERM_HERE&hl=en-US&gl=US&ceid=US%3Aen"
```

We could also create an updated code that takes user input to automate the html editing process, for instyance if we want to seach AAPL or dogs we can either edit the  

```
url = "https://news.google.com/news/rss/search?q=AAPL"
```

Alternativley we could take user input like this:

```
if __name__ == "__main__":
  userterm = input()
  url = "https://news.google.com/news/rss/search?q={userterm}s&hl=en-US&gl=US&ceid=US%3Aen"
  news_list = scrape_news(url)
  sentiments = analyze_sentiment(news_list)
  neg_sum, neu_sum, pos_sum = sum_sentiment_scores(sentiments)
  print("Negative sentiment:", neg_sum)
  print("Neutral sentiment:", neu_sum)
  print("Positive sentiment:", pos_sum)
  print("Positive - Negative Difference: ",pos_sum-neg_sum)

```



#The New Code With User input:

In [21]:
import nltk
nltk.download('vader_lexicon')
import requests
from bs4 import BeautifulSoup
from nltk.sentiment import SentimentIntensityAnalyzer


def scrape_news(url):
  """Scrapes the top headlines from a news website."""
  response = requests.get(url)
  soup = BeautifulSoup(response.content, "html.parser")
  news_list = soup.find_all("item")
  return news_list

def analyze_sentiment(news_list):
  """Analyzes the sentiment of the top headlines from a news website."""
  analyzer = SentimentIntensityAnalyzer()
  sentiments = []
  for news in news_list:
    title_sentiment = analyzer.polarity_scores(news.title.text)

    description_sentiment = analyzer.polarity_scores(news.description.text)
    sentiments.append((title_sentiment, description_sentiment))
  return sentiments

def sum_sentiment_scores(sentiments):
  """Sums the values in the 'neg', 'neu', and 'pos' columns of the sentiment scores."""
  neg_sum = 0
  neu_sum = 0
  pos_sum = 0
  for title_sentiment, description_sentiment in sentiments:
    neg_sum += title_sentiment['neg'] + description_sentiment['neg']
    neu_sum += title_sentiment['neu'] + description_sentiment['neu']
    pos_sum += title_sentiment['pos'] + description_sentiment['pos']
  return neg_sum, neu_sum, pos_sum

if __name__ == "__main__":
  userterm = input()
  url = "https://news.google.com/news/rss/search?q={userterm}s&hl=en-US&gl=US&ceid=US%3Aen"
  news_list = scrape_news(url)
  sentiments = analyze_sentiment(news_list)
  neg_sum, neu_sum, pos_sum = sum_sentiment_scores(sentiments)
  print("Negative sentiment:", neg_sum)
  print("Neutral sentiment:", neu_sum)
  print("Positive sentiment:", pos_sum)
  print("Positive - Negative Difference: ",pos_sum-neg_sum)


war
Negative sentiment: 1.8489999999999998
Neutral sentiment: 17.71
Positive sentiment: 0.44199999999999995
Positive - Negative Difference:  -1.4069999999999998


##Creating the webapp


Creating a web app that isn't self-hosted using Google Colab involves creating a web interface that interacts with your code running in Colab. Keep in mind that Colab is primarily designed for data analysis and machine learning tasks, and it's not meant to be a full-fledged web hosting platform. However, you can use tools like Flask and ngrok to create a temporary web interface for your script. Here's a basic outline of how you can achieve this:

1: Use Flask to Create a Web Interface:
Flask is a lightweight web framework for Python. You can use Flask to create a simple web interface that allows users to input their search term and displays the sentiment analysis results.

Install Flask by running the following command in a code cell in your Colab notebook:

In [27]:
!pip install flask




Then, create a new code cell and write a Flask app:

In [28]:
from flask import Flask, request, render_template
app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        userterm = request.form["userterm"]
        # ... Run your sentiment analysis code here ...
        # Return the sentiment analysis results as HTML
        return f"Sentiment analysis results for '{userterm}': ..."
    return render_template("index.html")

if __name__ == "__main__":
    app.run(debug=True)


 * Serving Flask app '__main__'
 * Debug mode: on


 * Running on http://127.0.0.1:5000
INFO:werkzeug:[33mPress CTRL+C to quit[0m
INFO:werkzeug: * Restarting with stat


2: Create an HTML Template:
You can create an HTML template named "index.html" in the same directory as your notebook. This template will provide the user interface for the web app. Here's a simple example:

```
<!DOCTYPE html>
<html>
<head>
    <title>Sentiment Analysis Web App</title>
</head>
<body>
    <h1>Sentiment Analysis Web App</h1>
    <form method="POST" action="/">
        <label for="userterm">Enter a search term:</label>
        <input type="text" name="userterm" required>
        <button type="submit">Analyze</button>
    </form>
</body>
</html>

```

3: Expose the Web App with ngrok:
ngrok is a tool that creates a secure tunnel from a public endpoint to a locally running web service. This allows you to temporarily expose your Colab-based Flask app to the internet.

Install ngrok by running this command in a code cell:

In [None]:
!pip install pyngrok

Then, create a new code cell to run ngrok:

In [None]:
from pyngrok import ngrok

# Set up the tunnel to your local Flask app
public_url = ngrok.connect(port=5000)
print("Public URL:", public_url)

The public URL provided by ngrok is what you can share with others to access your web app temporarily.


Please note that this approach is temporary and not suitable for hosting production-level web apps. The ngrok tunnel will expire after a certain period, and Colab environments are meant for interactive sessions, not for long-running web apps. For a more robust and reliable solution, you should consider using proper web hosting platforms and frameworks.

##Production-level web application

For a production-level web application, you should move away from Google Colab and consider using a dedicated web hosting service, a cloud provider, or a server you control. Here's a high-level outline of the steps you would take to deploy a sentiment analysis web application:

    Select a Hosting Provider:
    Choose a reliable hosting provider or cloud platform that offers web hosting services. Popular options include AWS (Amazon Web Services), Google Cloud Platform, Microsoft Azure, DigitalOcean, Heroku, and more.

    Prepare Your Code:
    Organize your code into a directory structure suitable for a production environment. Make sure your code follows best practices and is optimized for performance and security.

    Choose a Web Framework:
    Select a web framework for your application. Flask, Django, and FastAPI are popular choices for Python web development.

    Set Up a Production Server:
    Deploy your application on a production server. This involves setting up a web server (like Nginx or Apache) to serve your application, configuring a WSGI (Web Server Gateway Interface) server to handle requests, and creating a process manager (like Gunicorn or uWSGI) to manage your application's processes.

    Database Integration (Optional):
    If your application requires data storage, set up a production-ready database. MySQL, PostgreSQL, MongoDB, and others are common choices.

    Security Measures:
    Implement security best practices, such as HTTPS using SSL/TLS certificates, input validation, user authentication, and authorization. Regularly update your software to patch security vulnerabilities.

    Monitoring and Logging:
    Set up monitoring tools to track the performance, uptime, and usage of your application. Configure logging to capture errors and events for debugging.

    Domain Name and DNS Configuration:
    Obtain a domain name for your application and configure DNS settings to point to your server's IP address.

    Scaling (if needed):
    As your application grows, you might need to scale it horizontally (adding more servers) or vertically (increasing server resources).

    Continuous Integration and Deployment (CI/CD):
    Implement a CI/CD pipeline to automate deployment and updates. Tools like Jenkins, Travis CI, CircleCI, and GitHub Actions can help streamline this process.

    Backup and Disaster Recovery:
    Set up regular backups and implement disaster recovery plans to ensure data integrity and availability.

    Legal and Compliance:
    Ensure your application complies with relevant legal regulations, such as data protection laws (like GDPR) and copyright restrictions.

Deploying a production-level web application involves a complex set of tasks and considerations. It's recommended to thoroughly research each step and seek guidance from experienced developers if you're new to deploying applications in a production environment.
