# Python / AI Stock Screening and Sentiment Analysis

Looking for promising stocks to invest in can take a lot of time. What if we could use artificial intelligence (A.I.) and Python to make this process faster? In this article, I'll show you how to use the finvizfinance Python library to find stocks that might be undervalued. Then, I'll explain a simple way to analyze the sentiment (feelings) about these stocks using FinBERT, a pre-trained natural language processing (NLP) model.

Let's get started!

First, you'll need to install some libraries and bring them into your project. finviz.com is a website that provides tools for analyzing stocks, including a free stock screener. Here, we'll use the finvizfinance library to fetch information from the screener's 'Overview' section and store it in a Pandas data frame. This will help us quickly see the results of our stock analysis.

In [None]:
pip install finvizfinance



# Install the finvizfinance library

In [None]:
from finvizfinance.screener.overview import Overview

This class is part of the finvizfinance library and is typically used to access the Finviz screener's overview functionality. With this class, you can retrieve information about stocks based on specified filters, making it easier to analyze and identify potential investment opportunities.

In [None]:
import pandas as pd
import csv
import os

You've imported the pandas library and the csv and os modules. Here's a brief explanation of each:

pandas as pd: This alias pd is commonly used for the pandas library. pandas is a powerful data manipulation and analysis library in Python, and it provides data structures like DataFrames that are useful for handling and analyzing tabular data.

csv: This module provides functionality to work with CSV (Comma-Separated Values) files. It allows you to read from and write to CSV files, which are commonly used for storing tabular data.

os: This module provides a way to interact with the operating system. It includes functions to perform various operations related to the file system, such as creating directories (os.makedirs()), checking if a file or directory exists (os.path.exists()), and more.
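
As a small illustration of how these two modules fit together, the sketch below creates an output folder with os and then writes and reads back a tiny CSV file with csv. The folder name, file name, and rows are made up for the example, and everything lives in a temporary directory so nothing is left behind.

```python
import csv
import os
import tempfile

# Hypothetical rows, only for illustration (not real screener output).
rows = [
    {"Ticker": "AAA", "P/E": "9.5"},
    {"Ticker": "BBB", "P/E": "12.1"},
]

with tempfile.TemporaryDirectory() as base_dir:
    out_dir = os.path.join(base_dir, "out")
    os.makedirs(out_dir, exist_ok=True)  # no error if the folder already exists

    csv_path = os.path.join(out_dir, "example.csv")

    # Write the rows with a header line.
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["Ticker", "P/E"])
        writer.writeheader()
        writer.writerows(rows)

    # Read the file back and collect the ticker column.
    with open(csv_path, newline="") as f:
        loaded_tickers = [row["Ticker"] for row in csv.DictReader(f)]

print(loaded_tickers)
```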

In [None]:
def get_undervalued_stocks():
    """
    Returns a list of tickers with:

    - Positive Operating Margin
    - Debt-to-Equity ratio under 1
    - Low P/B (under 1)
    - Low P/E ratio (under 15)
    - Low PEG ratio (under 1)
    - Positive Insider Transactions
    - Small Market Cap ($300mln to $2bln)
    """
    foverview = Overview()

# undervalued_stocks function
The function get_undervalued_stocks() aims to return a list of stock tickers that meet specific criteria indicating potentially undervalued stocks. Let's break down each criterion:

# Positive Operating Margin:

An operating margin is a profitability ratio that measures the percentage of revenue remaining after covering operating expenses. A positive operating margin suggests that a company is generating profit from its core operations.
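
To make the definition concrete, here is a minimal sketch of the calculation: operating income divided by revenue. The function name and the figures are hypothetical and only serve as an illustration.

```python
def operating_margin(operating_income, revenue):
    """Return operating margin as a fraction of revenue."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    return operating_income / revenue


# Hypothetical figures: $12M operating income on $80M revenue.
margin = operating_margin(12_000_000, 80_000_000)
print(f"Operating margin: {margin:.1%}")

# A company losing money on operations has a negative margin
# and would fail the 'Positive (>0%)' filter.
loss_margin = operating_margin(-5_000_000, 80_000_000)
print(f"Operating margin: {loss_margin:.1%}")
```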

# Debt-to-Equity Ratio under 1:

The debt-to-equity ratio is a financial metric that compares a company's total debt to its shareholders' equity. A ratio under 1 means the company has less debt than equity, which is generally considered a sign of financial health.

# Low Price-to-Book (P/B) Ratio (under 1):

The price-to-book ratio compares a company's market value to its book value (assets minus liabilities). A ratio under 1 suggests that the stock may be undervalued relative to its book value.
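
The calculation described above can be sketched as follows, using book value = assets minus liabilities. The function name and the balance-sheet numbers are hypothetical.

```python
def price_to_book(market_cap, total_assets, total_liabilities):
    """Return market value divided by book value (assets minus liabilities)."""
    book_value = total_assets - total_liabilities
    if book_value <= 0:
        raise ValueError("book value must be positive for P/B to be meaningful")
    return market_cap / book_value


# Hypothetical company: $450M market cap, $900M assets, $300M liabilities.
pb = price_to_book(market_cap=450e6, total_assets=900e6, total_liabilities=300e6)
print(f"P/B: {pb:.2f}")  # below 1, so it would pass the 'Low (<1)' filter
```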

# Low Price-to-Earnings (P/E) Ratio (under 15):

The price-to-earnings ratio compares a company's current stock price to its earnings per share. A low P/E ratio relative to industry peers or historical values may indicate an undervalued stock.

# Low Price/Earnings to Growth (PEG) Ratio (under 1):

The PEG ratio adjusts the traditional P/E ratio by factoring in the expected earnings growth rate. A PEG ratio under 1 is often considered favorable, indicating that the stock may be undervalued relative to its expected growth.
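
As a worked example, PEG is commonly computed as the P/E ratio divided by the expected annual earnings growth rate expressed in percent. The sketch below uses hypothetical numbers; the function name is illustrative only.

```python
def peg_ratio(price, eps, growth_pct):
    """Return P/E divided by expected annual EPS growth (in percent)."""
    if eps <= 0 or growth_pct <= 0:
        raise ValueError("PEG is only meaningful for positive earnings and growth")
    pe = price / eps
    return pe / growth_pct


# Hypothetical stock: $24 share price, $2 earnings per share, 15% expected growth.
# P/E = 24 / 2 = 12, so PEG = 12 / 15.
peg = peg_ratio(price=24.0, eps=2.0, growth_pct=15.0)
print(f"PEG: {peg:.2f}")
```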

# Positive Insider Transactions:

Positive insider transactions indicate that company insiders (such as executives or directors) are buying shares of the company, which can be read as a sign of confidence in its prospects.

# Market Cap in the Range of 'Small' ($300 million to $2 billion):

Market capitalization (market cap) is the total market value of a company's outstanding shares. In this context, a 'Small' market cap falls between $300 million and $2 billion. Companies with smaller market caps may have more growth potential.

These criteria are commonly used in fundamental analysis to flag potentially undervalued stocks. The Overview() object from the finvizfinance library retrieves the stock data and filters it by these criteria.
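
To see how the seven criteria combine, the sketch below applies them locally to a couple of made-up companies. This is only an illustration of the logic: in this article the actual filtering is done by the Finviz screener, and the dictionary keys, tickers, and numbers here are hypothetical.

```python
def looks_undervalued(m):
    """Return True if a metrics dict satisfies all seven screening criteria."""
    return (
        m["operating_margin"] > 0            # positive operating margin
        and m["debt_to_equity"] < 1          # less debt than equity
        and m["pb"] < 1                      # low price-to-book
        and m["pe"] < 15                     # low price-to-earnings
        and m["peg"] < 1                     # low PEG
        and m["insider_transactions"] > 0    # net insider buying
        and 300e6 <= m["market_cap"] <= 2e9  # small cap
    )


# Hypothetical metrics for two fictional tickers.
candidates = {
    "AAA": {"operating_margin": 0.12, "debt_to_equity": 0.4, "pb": 0.8,
            "pe": 9.0, "peg": 0.7, "insider_transactions": 0.02,
            "market_cap": 750e6},
    "BBB": {"operating_margin": 0.20, "debt_to_equity": 0.3, "pb": 0.9,
            "pe": 22.0, "peg": 0.9, "insider_transactions": 0.01,
            "market_cap": 1.2e9},
}

passing = [ticker for ticker, m in candidates.items() if looks_undervalued(m)]
print(passing)  # BBB is excluded because its P/E is above 15
```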

In [None]:
def get_undervalued_stocks():
    foverview = Overview()

    filters_dict = {'Debt/Equity': 'Under 1',
                    'PEG': 'Low (<1)',
                    'Operating Margin': 'Positive (>0%)',
                    'P/B': 'Low (<1)',
                    'P/E': 'Low (<15)',
                    'InsiderTransactions': 'Positive (>0%)',
                    'Market Cap.': 'Small ($300mln to $2bln)'}  # other options include e.g. 'Nano (under $50mln)', 'Micro ($50mln to $300mln)', 'Mid ($2bln to $10bln)'

    parameters = ['Exchange', 'Index', 'Sector', 'Industry', 'Country', 'Market Cap.',
        'P/E', 'Forward P/E', 'PEG', 'P/S', 'P/B', 'Price/Cash', 'Price/Free Cash Flow',
        'EPS growththis year', 'EPS growthnext year', 'EPS growthpast 5 years', 'EPS growthnext 5 years',
        'Sales growthpast 5 years', 'EPS growthqtr over qtr', 'Sales growthqtr over qtr',
        'Dividend Yield', 'Return on Assets', 'Return on Equity', 'Return on Investment',
        'Current Ratio', 'Quick Ratio', 'LT Debt/Equity', 'Debt/Equity', 'Gross Margin',
        'Operating Margin', 'Net Profit Margin', 'Payout Ratio', 'InsiderOwnership', 'InsiderTransactions',
        'InstitutionalOwnership', 'InstitutionalTransactions', 'Float Short', 'Analyst Recom.',
        'Option/Short', 'Earnings Date', 'Performance', 'Performance 2', 'Volatility', 'RSI (14)',
        'Gap', '20-Day Simple Moving Average', '50-Day Simple Moving Average',
        '200-Day Simple Moving Average', 'Change', 'Change from Open', '20-Day High/Low',
        '50-Day High/Low', '52-Week High/Low', 'Pattern', 'Candlestick', 'Beta',
        'Average True Range', 'Average Volume', 'Relative Volume', 'Current Volume',
        'Price', 'Target Price', 'IPO Date', 'Shares Outstanding', 'Float']

    foverview.set_filter(filters_dict=filters_dict)
    df_overview = foverview.screener_view()

    if not os.path.exists('out'):  # ensures you have an 'out' folder ready
        os.makedirs('out')

    df_overview.to_csv('out/Overview.csv', index=False)
    tickers = df_overview['Ticker'].to_list()
    return tickers

print(get_undervalued_stocks())


[Info] loading page [##############################] 1/1
['BOOM', 'HAFC', 'HTLF', 'OCFC']


# undervalued_stocks

The code defines a Python function, get_undervalued_stocks, that uses the Finviz screener to fetch a list of potentially undervalued stocks based on specific criteria. The criteria include positive operating margin, a debt-to-equity ratio under 1, low P/B and P/E ratios, a low PEG ratio, positive insider transactions, and a market capitalization in the range of 'Small' ($300 million to $2 billion).

Step by step, the code:

- Sets filters for the Finviz screener based on the defined criteria.
- Defines a list of available screener parameters; the list is kept for reference and is not passed to the screener here.
- Applies the filters to the screener and retrieves data into a Pandas DataFrame.
- Checks if the 'out' directory exists and creates it if not.
- Saves the DataFrame to a CSV file ('Overview.csv') inside the 'out' directory.
- Extracts the 'Ticker' column from the DataFrame and returns a list of tickers.
- Prints the list of tickers returned by the function.

This code helps automate the process of identifying undervalued stocks by applying specific filters and saving the results for further analysis.

With the parameters above, the screener returns four potentially undervalued stocks: ['BOOM', 'HAFC', 'HTLF', 'OCFC'].

The tickers stand for:

BOOM: DMC Global Inc.

HAFC: Hanmi Financial Corporation.

HTLF: Heartland Financial USA, Inc.

OCFC: OceanFirst Financial Corp.

Feel free to play around with the filters to suit your needs. For example, you can make the screen industry-specific (such as technology or real estate).
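
One way to narrow the screen to a single sector is to add a 'Sector' entry to the filter dictionary before passing it to set_filter. The helper below is a hypothetical sketch, and the value 'Technology' is an assumption about how the Finviz screener labels that sector, so check the screener's option list before relying on it.

```python
# Same filters as in get_undervalued_stocks.
BASE_FILTERS = {
    'Debt/Equity': 'Under 1',
    'PEG': 'Low (<1)',
    'Operating Margin': 'Positive (>0%)',
    'P/B': 'Low (<1)',
    'P/E': 'Low (<15)',
    'InsiderTransactions': 'Positive (>0%)',
    'Market Cap.': 'Small ($300mln to $2bln)',
}


def with_sector(filters, sector):
    """Return a copy of the filters with a 'Sector' entry added (hypothetical helper)."""
    updated = dict(filters)  # copy so the base filters stay unchanged
    updated['Sector'] = sector
    return updated


# 'Technology' is assumed to match the screener's sector label.
tech_filters = with_sector(BASE_FILTERS, 'Technology')
print(tech_filters['Sector'])

# The result would then be used like the original dictionary:
#   foverview.set_filter(filters_dict=tech_filters)
```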

In [None]:
from transformers import pipeline

# transformers library

The line from transformers import pipeline imports the pipeline function from the transformers library. transformers is a popular open-source library from Hugging Face for natural language processing (NLP) and other machine learning tasks.

The pipeline function provides a simple interface to pre-trained models, so you can perform NLP tasks without manually loading and configuring a model. It can build pipelines for tasks such as text classification, named entity recognition, and sentiment analysis, using pre-trained models from the Hugging Face model hub.

For example, you might use pipeline("text-classification", model="ProsusAI/finbert") to create a pipeline for text classification using the FinBERT model.

In [None]:
import yfinance as yf

# yfinance library
The line import yfinance as yf imports the yfinance library and assigns it the alias yf. yfinance is a popular Python library that provides a simple interface to download financial data from Yahoo Finance.

In [None]:
pip install goose3



In [None]:
from goose3 import Goose
from requests import get

# goose3 library

goose3 is a Python library for extracting content from web pages (web scraping), and it's often used for extracting article content from news websites.

The requests library is a popular Python library for making HTTP requests. The get function, in particular, is commonly used to send HTTP GET requests to retrieve information from a specified URL.

In [None]:
def get_ticker_news_sentiment(tickers):
    data = []

    # Create the article extractor and the FinBERT pipeline once, not once per ticker
    extractor = Goose()
    pipe = pipeline("text-classification", model="ProsusAI/finbert")

    for ticker in tickers:
        try:
            ticker_news = yf.Ticker(ticker)
            news_list = ticker_news.get_news()

            if not news_list:
                print(f"No news found for {ticker}")
                continue

            for dic in news_list:
                title = dic['title']
                response = get(dic['link'])
                article = extractor.extract(raw_html=response.content)
                text = article.cleaned_text
                date = article.publish_date

                if len(text) > 512:
                    data.append({'Ticker': ticker,
                                 'Date': f'{date}',
                                 'Article title': f'{title}',
                                 'Article sentiment': 'NaN too long'})
                else:
                    results = pipe(text)
                    data.append({'Ticker': ticker,
                                 'Date': f'{date}',
                                 'Article title': f'{title}',
                                 'Article sentiment': results[0]['label']})

        except Exception as e:
            print(f"Error processing {ticker}: {e}")

    df = pd.DataFrame(data)
    return df

# Example usage
ticker_sentiment_df = get_ticker_news_sentiment(get_undervalued_stocks())
print(ticker_sentiment_df)

[Info] loading page [##############################] 1/1 


   Ticker                      Date  \
0    BOOM  2023-12-24T14:59:38.000Z   
1    BOOM  2023-12-03T12:57:03.000Z   
2    BOOM                      None   
3    BOOM                      None   
4    BOOM                      None   
5    BOOM                      None   
6    BOOM                      None   
7    BOOM                      None   
8    HAFC  2023-12-06T14:00:00.000Z   
9    HAFC                      None   
10   HAFC                      None   
11   HAFC                      None   
12   HAFC                      None   
13   HAFC                      None   
14   HAFC                      None   
15   HAFC                      None   
16   HTLF  2023-12-13T14:51:05.000Z   
17   HTLF                      None   
18   HTLF                      None   
19   HTLF                      None   
20   HTLF                      None   
21   HTLF                      None   
22   HTLF                      None   
23   HTLF                      None   
24   OCFC  2023-12-04T11:

The provided code defines a function get_ticker_news_sentiment that takes a list of stock tickers as input, retrieves news articles for each ticker using Yahoo Finance (yfinance library), performs sentiment analysis on the article text using the FinBERT model from Hugging Face (transformers library), and then constructs a Pandas DataFrame with the extracted information.

Here's a summary of the code:

# Function Definition:

Name: get_ticker_news_sentiment
Parameters: tickers (a list of stock tickers)

# Process for Each Ticker:

For each ticker in the input list:

- Retrieve news articles for the current ticker using Yahoo Finance (yf.Ticker(ticker).get_news()).
- If no news articles are found, print a message and move to the next ticker.
- Download each article and use the Goose library to extract its text and publish date; the title comes from the Yahoo Finance news entry.
- If the article text is longer than 512 characters, record the sentiment as 'NaN too long' instead of classifying it. This is a character-based cutoff; FinBERT's own limit is 512 tokens.
- Otherwise, perform sentiment analysis on the article text using the FinBERT model (pipeline("text-classification", model="ProsusAI/finbert")).
- Construct a dictionary with the ticker symbol, article date, title, and sentiment, and append it to the data list.
- Catch any exception raised while processing a ticker, print an error message, and continue with the next ticker.
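
The 'NaN too long' branch means any article over 512 characters gets no sentiment at all. One possible alternative, which is not what the code above does, is to split a long article into chunks of at most 512 characters, classify each chunk, and keep the most common label. The sketch below assumes this approach; split_text and majority_label are hypothetical helpers, and fake_classify is a stand-in for the FinBERT pipeline so the example runs without downloading a model.

```python
from collections import Counter


def split_text(text, max_chars=512):
    """Split text on whitespace into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for word in text.split():
        word = word[:max_chars]  # hard-truncate the rare over-long word
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def majority_label(text, classify, max_chars=512):
    """Classify each chunk and return the most common label (None for empty text)."""
    labels = [classify(chunk) for chunk in split_text(text, max_chars)]
    if not labels:
        return None
    return Counter(labels).most_common(1)[0][0]


def fake_classify(chunk):
    """Stand-in classifier for the demo; not a real sentiment model."""
    return "positive" if "growth" in chunk else "neutral"


long_text = "growth " * 200  # well over 512 characters
label = majority_label(long_text, fake_classify)
print(label)

# With the real pipeline, classify could be:
#   lambda chunk: pipe(chunk)[0]['label']
```

A majority vote is a deliberately simple aggregation; weighting chunks by the model's confidence score would be another option.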

# DataFrame Construction:

Create a Pandas DataFrame (df) from the accumulated data list.
Return the DataFrame.

# Example Usage:

Call get_ticker_news_sentiment with the list of undervalued stocks returned by the get_undervalued_stocks function.
Print the resulting DataFrame.

The code combines financial data retrieval using Yahoo Finance with natural language processing techniques to analyze news sentiment for a given list of stock tickers. It catches exceptions per ticker and provides an example of usage with undervalued stocks.
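
A table with one row per article can be hard to read at a glance. As a possible follow-up step, the sketch below counts sentiment labels per ticker. It works on plain dictionaries with the same 'Ticker' and 'Article sentiment' keys used above; the rows are hypothetical, and sentiment_counts is an illustrative helper, not part of any library.

```python
from collections import Counter, defaultdict

# Hypothetical rows shaped like the records built in get_ticker_news_sentiment.
records = [
    {"Ticker": "AAA", "Article sentiment": "positive"},
    {"Ticker": "AAA", "Article sentiment": "NaN too long"},
    {"Ticker": "AAA", "Article sentiment": "positive"},
    {"Ticker": "BBB", "Article sentiment": "negative"},
    {"Ticker": "BBB", "Article sentiment": "neutral"},
]


def sentiment_counts(rows):
    """Count sentiment labels per ticker, skipping articles that were not classified."""
    counts = defaultdict(Counter)
    for row in rows:
        label = row["Article sentiment"]
        if label == "NaN too long":
            continue  # no sentiment was computed for this article
        counts[row["Ticker"]][label] += 1
    return {ticker: dict(c) for ticker, c in counts.items()}


summary = sentiment_counts(records)
print(summary)
```

With the real DataFrame, the rows could be obtained with ticker_sentiment_df.to_dict("records").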

In [None]:
def generate_csv(ticker):
    # get_ticker_news_sentiment expects a list of tickers, so wrap the single ticker
    get_ticker_news_sentiment([ticker]).to_csv(f'out/{ticker}.csv', index=False)

The function generate_csv takes a single stock ticker as input, passes it (wrapped in a list) to get_ticker_news_sentiment to retrieve sentiment information for news articles related to that ticker, and saves the result as a CSV file in the 'out' directory under the filename '{ticker}.csv'.
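
Because get_ticker_news_sentiment loops over its argument, it matters whether it receives a list or a bare string: Python iterates over a string one character at a time. The hypothetical helper below is one small guard that accepts either form.

```python
def as_ticker_list(tickers):
    """Return tickers as a list, wrapping a single ticker string (hypothetical helper)."""
    if isinstance(tickers, str):
        return [tickers]
    return list(tickers)


# Iterating over a bare string yields single characters, not the ticker.
characters = list("BOOM")
print(characters)

# The guard keeps a single ticker intact and leaves lists unchanged.
single = as_ticker_list("BOOM")
several = as_ticker_list(["BOOM", "HAFC"])
print(single, several)
```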

In [None]:
undervalued = get_undervalued_stocks()
for ticker in undervalued:
    generate_csv(ticker)

This code block retrieves a list of undervalued stocks using the get_undervalued_stocks function and then uses generate_csv to write one CSV file of sentiment analysis results for each stock in the list.