# 1. Scraping stock news headlines from Finviz

### What is FINVIZ?

FINVIZ (Financial Visualization) is a popular and widely used financial website (www.finviz.com) that provides various tools and information for investors and traders. It offers a range of features that help users analyze and track financial markets, stocks, commodities, currencies, and other financial instruments.

Financial news now breaks within seconds, and we can leverage this data to gain an edge in investment planning.

#### Stocks to watch in the electric vehicle sector

**Tesla, Inc. (TSLA)**

![TESLA](https://thumbor.forbes.com/thumbor/fit-in/600x300/https://www.forbes.com/advisor/wp-content/uploads/2022/09/Tesla_Inc__TSLA_-removebg-preview.png)


Tesla is the biggest name in EVs. It has also become the largest car company in the world by market cap, traditional or electric—not to mention the eighth largest public company on the planet by capitalization.

The key for Tesla? The company makes a profit while most other stocks in this sector do not, netting earnings per share (EPS) of $3.62 in 2022.

One caveat for investors is that Tesla is more than a car company. Its business touches many areas, including batteries, robotics and solar panels.

**NIO Inc. (NIO)**

![NIO](https://thumbor.forbes.com/thumbor/fit-in/600x300/https://www.forbes.com/advisor/wp-content/uploads/2022/09/NIO_Inc.__NIO_-removebg-preview.png)

Like Tesla, Nio is one of the biggest names in the space, but the similarity ends there. Nio is a Chinese startup, which exposes investors to a whole new array of geopolitical risks.

Nio is also far smaller: Tesla delivered more than 400,000 vehicles in Q4 of 2022, ten times Nio’s quarterly deliveries.

But when you compare valuations, you can see the value. Nio trades at a significantly lower price/sales ratio, and the discount has widened this year. Nio’s returns have been relatively subdued, while Tesla has nearly doubled.

**Li Auto Inc. (LI)**

![LI](https://thumbor.forbes.com/thumbor/fit-in/600x300/https://www.forbes.com/advisor/wp-content/uploads/2022/09/Li_Auto_Inc.__LI_-removebg-preview.png)

Li Auto is a Chinese EV maker whose shares fared surprisingly well in 2022, falling only 5% when much of the rest of the market got crushed.

The EV automaker also recently announced Q1 2023 deliveries of 52,584 units, representing an impressive 66% year-over-year gain. For full-year 2022, Li delivered 133,246 units, up 47% year over year.

Li’s vehicles are all luxury SUVs targeted at high-end consumers. In the world of start-up EV automakers, revenue is a key focus, and Li Auto’s 2022 revenue gain stands out: up nearly 68% year over year.

**XPeng Inc. (XPEV)**

![XPENG](https://thumbor.forbes.com/thumbor/fit-in/600x300/https://www.forbes.com/advisor/wp-content/uploads/2022/09/XPeng_Inc.__XPEV_-removebg-preview.png)

XPeng is another Chinese EV manufacturer. The company makes SUVs, sport sedans and family sedans. It delivered 120,757 vehicles in 2022, up 23% year over year, and revenue climbed 28% over the previous year.

XPEV’s vehicles are targeted at “tech-savvy middle-class consumers,” according to the company. For investors banking on China’s economic growth continuing apace, this represents a tantalizing alternative for EV market exposure.

**Rivian Automotive (RIVN)**

![RIVN](https://companiesmarketcap.com/img/company-logos/256/RIVN.webp)

Rivian is an electric car manufacturer that has taken an innovative approach to vehicle design. It has developed a ‘skateboard’ platform, a wheeled chassis powered by four electric motors, one at each wheel. This chassis incorporates pre-installed fittings for various battery systems and can accommodate different body styles and functions. The result is a modular EV platform, designed from the ground up, that can adapt to a range of vehicle types.

**Lucid Group Inc. (LCID)**

![LCID](https://thumbor.forbes.com/thumbor/fit-in/600x300/https://www.forbes.com/advisor/wp-content/uploads/2022/09/Lucid-Group-Inc..png)

Lucid is an American EV maker known for producing the fastest-charging car on the market, the Lucid Air.

Lucid manufactures luxury EVs, but it delivered only 4,369 vehicles last year, blaming supply chain and production issues for its diminished performance. It’s also worth noting that Saudi Arabia’s public investment fund owns a 62% stake in the company.



In [1]:
import pandas as pd
import numpy as np

from bs4 import BeautifulSoup
from urllib.request import urlopen
from urllib.request import Request
from transformers import AutoTokenizer, AutoModelForSequenceClassification


In [2]:
# Extract TESLA Stock headline news
stock = 'TSLA'
news = {}

url = f'https://finviz.com/quote.ashx?t={stock}&ty=c&ta=1&p=d'

headers = {'user-agent': 'news_scraper'}
request = Request(url=url, headers=headers)
response = urlopen(request)

html = BeautifulSoup(response, features='html.parser')
finviz_news_table = html.find(id='news-table')
news[stock] = finviz_news_table

news_parsed = []
for stock, news_item in news.items():
    for row in news_item.find_all('tr'):
        try:
            cells = row.find_all('td')
            date_time = cells[0].text
            headline = cells[1].a.get_text()
            source = cells[1].span.get_text()
            news_parsed.append([stock, date_time, headline, source])
        except (AttributeError, IndexError):
            # Skip rows that lack a headline link or a source span.
            pass

TSLA = pd.DataFrame(news_parsed, columns=['stock', 'date-time', 'headline', 'source'])
TSLA

Unnamed: 0,stock,date-time,headline,source
0,TSLA,\r\n Aug-23-23 09:07AM\r\n,"Polestar Upgrades Model, Hits EV Production Mi...",(Barrons.com)
1,TSLA,\r\n 09:04AM\r\n,15 Highest Paying Countries for Engineers,(Insider Monkey)
2,TSLA,\r\n 06:16AM\r\n,Tesla's German plant lowers production target ...,(Reuters)
3,TSLA,\r\n 06:10AM\r\n,"Down 9% in the Past 5 Days, Is Now the Right T...",(Motley Fool)
4,TSLA,\r\n 06:04AM\r\n,Tesla's German plant lowers production target ...,(Reuters)
...,...,...,...,...
95,TSLA,\r\n 05:12PM\r\n,Ford CEO Jim Farley Test Drives an F-150 Light...,(Observer)
96,TSLA,\r\n 04:07PM\r\n,NYC To Boston After 10-Minute Charge? Tesla Su...,(Investor's Business Daily)
97,TSLA,\r\n 04:06PM\r\n,Tesla Extends Slide On Latest China EV Price W...,(Investor's Business Daily)
98,TSLA,\r\n 04:00PM\r\n,Mark Cuban takes another jab at Elon Musk' s b...,(TheStreet.com)
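
The parsing loop in the cell above can also be wrapped in a function and exercised without a network call. The sketch below is only an illustration: `parse_news_table` is a hypothetical helper name, and `sample_html` is hand-written markup that mimics the structure the cell relies on (an element with `id='news-table'` whose rows hold a timestamp cell and a cell with a link and a `span`). It is not a copy of the real Finviz page.

```python
from bs4 import BeautifulSoup
import pandas as pd

COLUMNS = ['stock', 'date-time', 'headline', 'source']


def parse_news_table(html_text, stock):
    """Return one row per headline found in the element with id='news-table'."""
    soup = BeautifulSoup(html_text, features='html.parser')
    table = soup.find(id='news-table')
    rows = []
    if table is None:
        # No news table found (layout changed, request blocked, ...).
        return pd.DataFrame(rows, columns=COLUMNS)
    for row in table.find_all('tr'):
        cells = row.find_all('td')
        # Skip rows that lack a headline link or a source span.
        if len(cells) < 2 or cells[1].a is None or cells[1].span is None:
            continue
        rows.append([stock,
                     cells[0].get_text(),
                     cells[1].a.get_text(),
                     cells[1].span.get_text()])
    return pd.DataFrame(rows, columns=COLUMNS)


# Hypothetical markup for illustration only.
sample_html = """
<table id="news-table">
  <tr><td> Aug-23-23 09:07AM </td><td><a href="#">Headline one</a><span>(Source A)</span></td></tr>
  <tr><td> 09:04AM </td><td><a href="#">Headline two</a><span>(Source B)</span></td></tr>
  <tr><td>spacer row without a link</td></tr>
</table>
"""

sample = parse_news_table(sample_html, 'TSLA')
print(sample)
```

Checking for missing elements explicitly, instead of catching exceptions, makes it easier to see which rows were skipped and why.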


In [3]:
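# Remove the line breaks around each timestamp.
TSLA['date-time'] = TSLA['date-time'].str.replace('\r\n', '', regex=False)

TSLA['date-time'] = TSLA['date-time'].str.strip()
# Rows that carry only a time (e.g. '09:04AM') do not match this format and become NaT.
TSLA['date-time'] = pd.to_datetime(TSLA['date-time'], format='%b-%d-%y %I:%M%p', errors='coerce')

# Forward-fill NaT with the previous full timestamp; this copies the time as well as the date.
TSLA['date-time'] = TSLA['date-time'].ffill()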

TSLA['date'] = TSLA['date-time'].dt.strftime('%b-%d-%y')
TSLA['time'] = TSLA['date-time'].dt.strftime('%I:%M%p')

TSLA = TSLA.drop(columns=['date-time'])

TSLA


Unnamed: 0,stock,headline,source,date,time
0,TSLA,"Polestar Upgrades Model, Hits EV Production Mi...",(Barrons.com),Aug-23-23,09:07AM
1,TSLA,15 Highest Paying Countries for Engineers,(Insider Monkey),Aug-23-23,09:07AM
2,TSLA,Tesla's German plant lowers production target ...,(Reuters),Aug-23-23,09:07AM
3,TSLA,"Down 9% in the Past 5 Days, Is Now the Right T...",(Motley Fool),Aug-23-23,09:07AM
4,TSLA,Tesla's German plant lowers production target ...,(Reuters),Aug-23-23,09:07AM
...,...,...,...,...,...
95,TSLA,Ford CEO Jim Farley Test Drives an F-150 Light...,(Observer),Aug-16-23,08:00PM
96,TSLA,NYC To Boston After 10-Minute Charge? Tesla Su...,(Investor's Business Daily),Aug-16-23,08:00PM
97,TSLA,Tesla Extends Slide On Latest China EV Price W...,(Investor's Business Daily),Aug-16-23,08:00PM
98,TSLA,Mark Cuban takes another jab at Elon Musk' s b...,(TheStreet.com),Aug-16-23,08:00PM
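
In the output above, rows 1 to 4 all show 09:07AM, although their raw cells read 09:04AM, 06:16AM, 06:10AM and 06:04AM. Forward-filling the whole timestamp overwrites the time of every time-only row. One possible alternative, sketched below, splits each cell into a date part and a time part and forward-fills only the date. The sample values are illustrative strings in the shape of the raw `date-time` column, and the regular expression assumes that exact shape.

```python
import pandas as pd

# Illustrative cells shaped like the raw 'date-time' column.
raw = pd.Series([
    '\r\n Aug-23-23 09:07AM\r\n',
    '\r\n 09:04AM\r\n',
    '\r\n 06:16AM\r\n',
    '\r\n Aug-22-23 05:12PM\r\n',
])

# Optional date (e.g. 'Aug-23-23') followed by a mandatory time (e.g. '09:07AM').
pattern = r'^(?:(?P<date>[A-Za-z]{3}-\d{2}-\d{2})\s+)?(?P<time>\d{2}:\d{2}[AP]M)$'
parts = raw.str.strip().str.extract(pattern)

# Only the date is carried forward; each row keeps its own time.
parts['date'] = parts['date'].ffill()
print(parts)
```

With this approach the `time` column keeps the value scraped for each row, while time-only rows still inherit the date of the most recent dated row.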


In [4]:
TSLA.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   stock     100 non-null    object
 1   headline  100 non-null    object
 2   source    100 non-null    object
 3   date      100 non-null    object
 4   time      100 non-null    object
dtypes: object(5)
memory usage: 4.0+ KB
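
The `source` column still carries the parentheses from the page, for example `(Barrons.com)`. If plain publisher names are preferred, a minimal sketch of one way to strip them follows. It runs on a small stand-alone Series, not on the scraped frame.

```python
import pandas as pd

# Sample values copied from the 'source' column shown above.
sources = pd.Series(['(Barrons.com)', '(Insider Monkey)', "(Investor's Business Daily)"])

# Trim whitespace, then remove leading/trailing parentheses only.
clean = sources.str.strip().str.strip('()')
print(clean.tolist())
```

Because `str.strip('()')` only touches the ends of each string, parentheses inside a publisher name would be left alone.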


In [5]:
# Extract Nio Stock headline news
stock = 'NIO'
news = {}

url = f'https://finviz.com/quote.ashx?t={stock}&ty=c&ta=1&p=d'

headers = {'user-agent': 'news_scraper'}
request = Request(url=url, headers=headers)
response = urlopen(request)

html = BeautifulSoup(response, features='html.parser')
finviz_news_table = html.find(id='news-table')
news[stock] = finviz_news_table

news_parsed = []
for stock, news_item in news.items():
    for row in news_item.find_all('tr'):
        try:
            cells = row.find_all('td')
            date_time = cells[0].text
            headline = cells[1].a.get_text()
            source = cells[1].span.get_text()
            news_parsed.append([stock, date_time, headline, source])
        except (AttributeError, IndexError):
            # Skip rows that lack a headline link or a source span.
            pass

NIO = pd.DataFrame(news_parsed, columns=['stock', 'date-time', 'headline', 'source'])
NIO


Unnamed: 0,stock,date-time,headline,source
0,NIO,\r\n Aug-22-23 07:01PM\r\n,Graphite Wars: The Trillion Dollar Battery Rac...,(Oilprice.com)
1,NIO,\r\n Aug-21-23 08:49AM\r\n,Global EV Market Share by Company: Top 15 Comp...,(Insider Monkey)
2,NIO,\r\n Aug-17-23 05:45PM\r\n,NIO Inc. (NIO) Gains As Market Dips: What You ...,(Zacks)
3,NIO,\r\n 01:33PM\r\n,VinFast Stock Falls for Second Day; Chinese EV...,(The Wall Street Journal)
4,NIO,\r\n Aug-16-23 09:52AM\r\n,VinFast Stock Plunges After EV Maker's Blockbu...,(The Wall Street Journal)
...,...,...,...,...
95,NIO,\r\n Jun-12-23 04:57PM\r\n,Best Stocks to Buy Now: Is Nio Stock a Buy Aft...,(Motley Fool)
96,NIO,\r\n 02:01PM\r\n,Options Traders Score Cheap Premium After Nio ...,(Schaeffer's Investment Research)
97,NIO,\r\n 01:33PM\r\n,"NIO Q1 Loss Wider Than Expected, Vehicle Margi...",(Zacks)
98,NIO,\r\n 11:16AM\r\n,Why Nio Stock Shot Higher Today,(Motley Fool)
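
The scraping cells differ only in the ticker symbol, so the request can be built from the ticker. The sketch below constructs the `Request` objects without sending them. `build_request` and `TICKERS` are hypothetical names introduced for illustration, while the URL pattern and the `user-agent` header are the ones used in the cells above.

```python
from urllib.request import Request

TICKERS = ['TSLA', 'NIO', 'LI', 'RIVN', 'XPEV', 'LCID']


def build_request(stock):
    """Build (but do not send) the quote-page request for one ticker."""
    url = f'https://finviz.com/quote.ashx?t={stock}&ty=c&ta=1&p=d'
    return Request(url=url, headers={'user-agent': 'news_scraper'})


requests_by_stock = {stock: build_request(stock) for stock in TICKERS}
for stock, req in requests_by_stock.items():
    print(stock, req.full_url)
```

Each request could then be passed to `urlopen` in a loop, ideally with a short pause between calls.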


In [6]:
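# Remove the line breaks around each timestamp.
NIO['date-time'] = NIO['date-time'].str.replace('\r\n', '', regex=False)

NIO['date-time'] = NIO['date-time'].str.strip()
# Time-only rows do not match this format and become NaT.
NIO['date-time'] = pd.to_datetime(NIO['date-time'], format='%b-%d-%y %I:%M%p', errors='coerce')

# Forward-fill NaT with the previous full timestamp (date and time).
NIO['date-time'] = NIO['date-time'].ffill()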

NIO['date'] = NIO['date-time'].dt.strftime('%b-%d-%y')
NIO['time'] = NIO['date-time'].dt.strftime('%I:%M%p')

NIO = NIO.drop(columns=['date-time'])

NIO


Unnamed: 0,stock,headline,source,date,time
0,NIO,Graphite Wars: The Trillion Dollar Battery Rac...,(Oilprice.com),Aug-22-23,07:01PM
1,NIO,Global EV Market Share by Company: Top 15 Comp...,(Insider Monkey),Aug-21-23,08:49AM
2,NIO,NIO Inc. (NIO) Gains As Market Dips: What You ...,(Zacks),Aug-17-23,05:45PM
3,NIO,VinFast Stock Falls for Second Day; Chinese EV...,(The Wall Street Journal),Aug-17-23,05:45PM
4,NIO,VinFast Stock Plunges After EV Maker's Blockbu...,(The Wall Street Journal),Aug-16-23,09:52AM
...,...,...,...,...,...
95,NIO,Best Stocks to Buy Now: Is Nio Stock a Buy Aft...,(Motley Fool),Jun-12-23,04:57PM
96,NIO,Options Traders Score Cheap Premium After Nio ...,(Schaeffer's Investment Research),Jun-12-23,04:57PM
97,NIO,"NIO Q1 Loss Wider Than Expected, Vehicle Margi...",(Zacks),Jun-12-23,04:57PM
98,NIO,Why Nio Stock Shot Higher Today,(Motley Fool),Jun-12-23,04:57PM


In [7]:
NIO.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   stock     100 non-null    object
 1   headline  100 non-null    object
 2   source    100 non-null    object
 3   date      100 non-null    object
 4   time      100 non-null    object
dtypes: object(5)
memory usage: 4.0+ KB


In [8]:
# Extract LI Stock headline news
stock = 'LI'
news = {}

url = f'https://finviz.com/quote.ashx?t={stock}&ty=c&ta=1&p=d'

headers = {'user-agent': 'news_scraper'}
request = Request(url=url, headers=headers)
response = urlopen(request)

html = BeautifulSoup(response, features='html.parser')
finviz_news_table = html.find(id='news-table')
news[stock] = finviz_news_table

news_parsed = []
for stock, news_item in news.items():
    for row in news_item.find_all('tr'):
        try:
            cells = row.find_all('td')
            date_time = cells[0].text
            headline = cells[1].a.get_text()
            source = cells[1].span.get_text()
            news_parsed.append([stock, date_time, headline, source])
        except (AttributeError, IndexError):
            # Skip rows that lack a headline link or a source span.
            pass

LI = pd.DataFrame(news_parsed, columns=['stock', 'date-time', 'headline', 'source'])
LI

Unnamed: 0,stock,date-time,headline,source
0,LI,\r\n Aug-18-23 12:37PM\r\n,XPeng Succumbed To Tesla's Price War But Risin...,(Benzinga)
1,LI,\r\n Aug-17-23 01:33PM\r\n,VinFast Stock Falls for Second Day; Chinese EV...,(The Wall Street Journal)
2,LI,\r\n Aug-16-23 01:37PM\r\n,VinFast Stock Plunges After EV Maker's Blockbu...,(The Wall Street Journal)
3,LI,\r\n Aug-15-23 07:00AM\r\n,The 3 Most Undervalued EV Stocks to Buy Now: A...,(InvestorPlace)
4,LI,\r\n 05:30AM\r\n,Best Stocks to Buy Now: Rivian vs. Nio vs. Li ...,(Motley Fool)
...,...,...,...,...
95,LI,\r\n 11:03AM\r\n,Stellantis (STLA) & Archer Partnership Moves t...,(Zacks)
96,LI,\r\n Jun-23-23 05:00AM\r\n,"The Zacks Analyst Blog Highlights Tesla, Li Au...",(Zacks)
97,LI,\r\n Jun-22-23 10:07AM\r\n,3 Stocks to Watch as China Extends EV Tax Exem...,(Zacks)
98,LI,\r\n 07:30AM\r\n,The Best EV Stock to Buy Right Now? Li Auto. H...,(InvestorPlace)


In [9]:
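# Remove the line breaks around each timestamp.
LI['date-time'] = LI['date-time'].str.replace('\r\n', '', regex=False)

LI['date-time'] = LI['date-time'].str.strip()
# Time-only rows do not match this format and become NaT.
LI['date-time'] = pd.to_datetime(LI['date-time'], format='%b-%d-%y %I:%M%p', errors='coerce')

# Forward-fill NaT with the previous full timestamp (date and time).
LI['date-time'] = LI['date-time'].ffill()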

LI['date'] = LI['date-time'].dt.strftime('%b-%d-%y')
LI['time'] = LI['date-time'].dt.strftime('%I:%M%p')

LI = LI.drop(columns=['date-time'])

LI


Unnamed: 0,stock,headline,source,date,time
0,LI,XPeng Succumbed To Tesla's Price War But Risin...,(Benzinga),Aug-18-23,12:37PM
1,LI,VinFast Stock Falls for Second Day; Chinese EV...,(The Wall Street Journal),Aug-17-23,01:33PM
2,LI,VinFast Stock Plunges After EV Maker's Blockbu...,(The Wall Street Journal),Aug-16-23,01:37PM
3,LI,The 3 Most Undervalued EV Stocks to Buy Now: A...,(InvestorPlace),Aug-15-23,07:00AM
4,LI,Best Stocks to Buy Now: Rivian vs. Nio vs. Li ...,(Motley Fool),Aug-15-23,07:00AM
...,...,...,...,...,...
95,LI,Stellantis (STLA) & Archer Partnership Moves t...,(Zacks),Jun-26-23,11:07AM
96,LI,"The Zacks Analyst Blog Highlights Tesla, Li Au...",(Zacks),Jun-23-23,05:00AM
97,LI,3 Stocks to Watch as China Extends EV Tax Exem...,(Zacks),Jun-22-23,10:07AM
98,LI,The Best EV Stock to Buy Right Now? Li Auto. H...,(InvestorPlace),Jun-22-23,10:07AM


In [10]:
LI.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   stock     100 non-null    object
 1   headline  100 non-null    object
 2   source    100 non-null    object
 3   date      100 non-null    object
 4   time      100 non-null    object
dtypes: object(5)
memory usage: 4.0+ KB


In [11]:
# Extract RIVN Stock headline news
stock = 'RIVN'
news = {}

url = f'https://finviz.com/quote.ashx?t={stock}&ty=c&ta=1&p=d'

headers = {'user-agent': 'news_scraper'}
request = Request(url=url, headers=headers)
response = urlopen(request)

html = BeautifulSoup(response, features='html.parser')
finviz_news_table = html.find(id='news-table')
news[stock] = finviz_news_table

news_parsed = []
for stock, news_item in news.items():
    for row in news_item.find_all('tr'):
        try:
            cells = row.find_all('td')
            date_time = cells[0].text
            headline = cells[1].a.get_text()
            source = cells[1].span.get_text()
            news_parsed.append([stock, date_time, headline, source])
        except (AttributeError, IndexError):
            # Skip rows that lack a headline link or a source span.
            pass

RIVN = pd.DataFrame(news_parsed, columns=['stock', 'date-time', 'headline', 'source'])
RIVN.head()

Unnamed: 0,stock,date-time,headline,source
0,RIVN,\r\n Aug-22-23 05:21PM\r\n,VinFast Stock Soars 109%. Its a Head-Scratcher.,(Barrons.com)
1,RIVN,\r\n 01:05PM\r\n,EV Stock VinFast Is Soaring Again. It Makes No...,(Barrons.com)
2,RIVN,\r\n 05:30AM\r\n,Rivian's Little-Known Competitive Advantage,(Motley Fool)
3,RIVN,\r\n Aug-21-23 11:52AM\r\n,Electric Pickups Are Upleveling The Power Of EVs,(Benzinga)
4,RIVN,\r\n 10:20AM\r\n,Rivian Is Making Huge Strides Toward Profitabi...,(Motley Fool)


In [12]:
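# Remove the line breaks around each timestamp.
RIVN['date-time'] = RIVN['date-time'].str.replace('\r\n', '', regex=False)

RIVN['date-time'] = RIVN['date-time'].str.strip()
# Time-only rows do not match this format and become NaT.
RIVN['date-time'] = pd.to_datetime(RIVN['date-time'], format='%b-%d-%y %I:%M%p', errors='coerce')

# Forward-fill NaT with the previous full timestamp (date and time).
RIVN['date-time'] = RIVN['date-time'].ffill()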

RIVN['date'] = RIVN['date-time'].dt.strftime('%b-%d-%y')
RIVN['time'] = RIVN['date-time'].dt.strftime('%I:%M%p')

RIVN = RIVN.drop(columns=['date-time'])

RIVN


Unnamed: 0,stock,headline,source,date,time
0,RIVN,VinFast Stock Soars 109%. Its a Head-Scratcher.,(Barrons.com),Aug-22-23,05:21PM
1,RIVN,EV Stock VinFast Is Soaring Again. It Makes No...,(Barrons.com),Aug-22-23,05:21PM
2,RIVN,Rivian's Little-Known Competitive Advantage,(Motley Fool),Aug-22-23,05:21PM
3,RIVN,Electric Pickups Are Upleveling The Power Of EVs,(Benzinga),Aug-21-23,11:52AM
4,RIVN,Rivian Is Making Huge Strides Toward Profitabi...,(Motley Fool),Aug-21-23,11:52AM
...,...,...,...,...,...
95,RIVN,Rivian names Alan Hoffman as Chief Policy Officer,(Business Wire),Jul-31-23,09:00AM
96,RIVN,How Smart Money Is Playing The $700 Billion EV...,(Oilprice.com),Jul-30-23,08:00PM
97,RIVN,"12 Best Energy ETFs: Top Oil, Gas and Renewabl...",(Insider Monkey),Jul-29-23,11:05AM
98,RIVN,Will Tesla's Price Cuts Kill Rivian's Rally?,(Motley Fool),Jul-29-23,11:05AM


In [13]:
RIVN.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   stock     100 non-null    object
 1   headline  100 non-null    object
 2   source    100 non-null    object
 3   date      100 non-null    object
 4   time      100 non-null    object
dtypes: object(5)
memory usage: 4.0+ KB


In [14]:
# Extract XPeng Stock headline news
stock = 'XPEV'
news = {}

url = f'https://finviz.com/quote.ashx?t={stock}&ty=c&ta=1&p=d'

headers = {'user-agent': 'news_scraper'}
request = Request(url=url, headers=headers)
response = urlopen(request)

html = BeautifulSoup(response, features='html.parser')
finviz_news_table = html.find(id='news-table')
news[stock] = finviz_news_table

news_parsed = []
for stock, news_item in news.items():
    for row in news_item.find_all('tr'):
        try:
            cells = row.find_all('td')
            date_time = cells[0].text
            headline = cells[1].a.get_text()
            source = cells[1].span.get_text()
            news_parsed.append([stock, date_time, headline, source])
        except (AttributeError, IndexError):
            # Skip rows that lack a headline link or a source span.
            pass

XPEV = pd.DataFrame(news_parsed, columns=['stock', 'date-time', 'headline', 'source'])
XPEV.head()


Unnamed: 0,stock,date-time,headline,source
0,XPEV,\r\n Aug-21-23 04:54PM\r\n,XPeng Stock Soars Above Key Level Amid Hopes F...,(Investor's Business Daily)
1,XPEV,\r\n 11:21AM\r\n,XPeng Inc. (NYSE:XPEV) Q2 2023 Earnings Call T...,(Insider Monkey)
2,XPEV,\r\n 10:25AM\r\n,XPeng stock upgraded by BofA analyst on Volksw...,(Yahoo Finance Video)
3,XPEV,\r\n 10:19AM\r\n,XPeng Stock Surges on Upgrade. Positive Free C...,(Barrons.com)
4,XPEV,\r\n 10:18AM\r\n,"XPeng Soars To Regain Key Level, Rebounding Fr...",(Investor's Business Daily)


In [15]:
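# Remove the line breaks around each timestamp.
XPEV['date-time'] = XPEV['date-time'].str.replace('\r\n', '', regex=False)

XPEV['date-time'] = XPEV['date-time'].str.strip()
# Time-only rows do not match this format and become NaT.
XPEV['date-time'] = pd.to_datetime(XPEV['date-time'], format='%b-%d-%y %I:%M%p', errors='coerce')

# Forward-fill NaT with the previous full timestamp (date and time).
XPEV['date-time'] = XPEV['date-time'].ffill()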

XPEV['date'] = XPEV['date-time'].dt.strftime('%b-%d-%y')
XPEV['time'] = XPEV['date-time'].dt.strftime('%I:%M%p')

XPEV = XPEV.drop(columns=['date-time'])

XPEV


Unnamed: 0,stock,headline,source,date,time
0,XPEV,XPeng Stock Soars Above Key Level Amid Hopes F...,(Investor's Business Daily),Aug-21-23,04:54PM
1,XPEV,XPeng Inc. (NYSE:XPEV) Q2 2023 Earnings Call T...,(Insider Monkey),Aug-21-23,04:54PM
2,XPEV,XPeng stock upgraded by BofA analyst on Volksw...,(Yahoo Finance Video),Aug-21-23,04:54PM
3,XPEV,XPeng Stock Surges on Upgrade. Positive Free C...,(Barrons.com),Aug-21-23,04:54PM
4,XPEV,"XPeng Soars To Regain Key Level, Rebounding Fr...",(Investor's Business Daily),Aug-21-23,04:54PM
...,...,...,...,...,...
95,XPEV,Nio Stock Eyes Return To Glory. Its EV Sales A...,(Investor's Business Daily),Jul-12-23,04:15PM
96,XPEV,LI Stock Prediction: Li Auto Will Be a Vehicle...,(InvestorPlace),Jul-11-23,07:31AM
97,XPEV,XPeng's Model Y Rival Hits The Road. This Will...,(Investor's Business Daily),Jul-10-23,04:21PM
98,XPEV,3 Energy Stocks That AI Predicts Will Deliver ...,(InvestorPlace),Jul-10-23,04:21PM


In [16]:
XPEV.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   stock     100 non-null    object
 1   headline  100 non-null    object
 2   source    100 non-null    object
 3   date      100 non-null    object
 4   time      100 non-null    object
dtypes: object(5)
memory usage: 4.0+ KB


In [17]:
# Extract LUCID Stock headline news
stock = 'LCID'
news = {}

url = f'https://finviz.com/quote.ashx?t={stock}&ty=c&ta=1&p=d'

headers = {'user-agent': 'news_scraper'}
request = Request(url=url, headers=headers)
response = urlopen(request)

html = BeautifulSoup(response, features='html.parser')
finviz_news_table = html.find(id='news-table')
news[stock] = finviz_news_table

news_parsed = []
for stock, news_item in news.items():
    for row in news_item.find_all('tr'):
        try:
            cells = row.find_all('td')
            date_time = cells[0].text
            headline = cells[1].a.get_text()
            source = cells[1].span.get_text()
            news_parsed.append([stock, date_time, headline, source])
        except (AttributeError, IndexError):
            # Skip rows that lack a headline link or a source span.
            pass

LCID = pd.DataFrame(news_parsed, columns=['stock', 'date-time', 'headline', 'source'])
LCID.head()

Unnamed: 0,stock,date-time,headline,source
0,LCID,\r\n Aug-22-23 01:25PM\r\n,"Lucid CEO: Air EV price cuts 'well received,' ...",(Yahoo Finance)
1,LCID,\r\n 12:37PM\r\n,"Lucid returning to original price structure, C...",(Yahoo Finance Video)
2,LCID,\r\n Aug-15-23 05:30AM\r\n,Tesla Alums Win Big in Green Energy Bonanza,(The Wall Street Journal)
3,LCID,\r\n Aug-14-23 09:55AM\r\n,Tesla stock falls after slashing some Model Y ...,(Yahoo Finance Video)
4,LCID,\r\n Aug-09-23 09:13AM\r\n,Sentiment Check: Everyday Investors Shun Cycli...,(The Wall Street Journal)


In [18]:
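# Remove the line breaks around each timestamp.
LCID['date-time'] = LCID['date-time'].str.replace('\r\n', '', regex=False)

LCID['date-time'] = LCID['date-time'].str.strip()
# Time-only rows do not match this format and become NaT.
LCID['date-time'] = pd.to_datetime(LCID['date-time'], format='%b-%d-%y %I:%M%p', errors='coerce')

# Forward-fill NaT with the previous full timestamp (date and time).
LCID['date-time'] = LCID['date-time'].ffill()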

LCID['date'] = LCID['date-time'].dt.strftime('%b-%d-%y')
LCID['time'] = LCID['date-time'].dt.strftime('%I:%M%p')

LCID = LCID.drop(columns=['date-time'])

LCID


Unnamed: 0,stock,headline,source,date,time
0,LCID,"Lucid CEO: Air EV price cuts 'well received,' ...",(Yahoo Finance),Aug-22-23,01:25PM
1,LCID,"Lucid returning to original price structure, C...",(Yahoo Finance Video),Aug-22-23,01:25PM
2,LCID,Tesla Alums Win Big in Green Energy Bonanza,(The Wall Street Journal),Aug-15-23,05:30AM
3,LCID,Tesla stock falls after slashing some Model Y ...,(Yahoo Finance Video),Aug-14-23,09:55AM
4,LCID,Sentiment Check: Everyday Investors Shun Cycli...,(The Wall Street Journal),Aug-09-23,09:13AM
...,...,...,...,...,...
95,LCID,Lucid stock declines after missing on Q4 reven...,(Yahoo Finance Video),Feb-23-23,04:17PM
96,LCID,EV Startups Have a New Bottleneck: Demand,(The Wall Street Journal),Feb-23-23,04:17PM
97,LCID,EV Startup Lucid Aims to Nearly Double Product...,(The Wall Street Journal),Feb-22-23,09:14PM
98,LCID,"Yahoo Finance Trending Tickers: Nvidia, Lucid,...",(Yahoo Finance),Feb-22-23,09:14PM


In [19]:
LCID.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   stock     100 non-null    object
 1   headline  100 non-null    object
 2   source    100 non-null    object
 3   date      100 non-null    object
 4   time      100 non-null    object
dtypes: object(5)
memory usage: 4.0+ KB


In [20]:
df = pd.concat([TSLA, NIO, LI, RIVN, XPEV, LCID], ignore_index=True)
df

Unnamed: 0,stock,headline,source,date,time
0,TSLA,"Polestar Upgrades Model, Hits EV Production Mi...",(Barrons.com),Aug-23-23,09:07AM
1,TSLA,15 Highest Paying Countries for Engineers,(Insider Monkey),Aug-23-23,09:07AM
2,TSLA,Tesla's German plant lowers production target ...,(Reuters),Aug-23-23,09:07AM
3,TSLA,"Down 9% in the Past 5 Days, Is Now the Right T...",(Motley Fool),Aug-23-23,09:07AM
4,TSLA,Tesla's German plant lowers production target ...,(Reuters),Aug-23-23,09:07AM
...,...,...,...,...,...
595,LCID,Lucid stock declines after missing on Q4 reven...,(Yahoo Finance Video),Feb-23-23,04:17PM
596,LCID,EV Startups Have a New Bottleneck: Demand,(The Wall Street Journal),Feb-23-23,04:17PM
597,LCID,EV Startup Lucid Aims to Nearly Double Product...,(The Wall Street Journal),Feb-22-23,09:14PM
598,LCID,"Yahoo Finance Trending Tickers: Nvidia, Lucid,...",(Yahoo Finance),Feb-22-23,09:14PM
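
Some headlines appear to repeat in the truncated display, both within one ticker (the two Reuters rows for TSLA) and across tickers (the VinFast stories listed under NIO and LI). Whether these are true duplicates depends on the full headline text. The sketch below shows, on a small made-up frame, how repeated headlines could be dropped per ticker or across the whole combined frame. The rows in `demo` are hypothetical.

```python
import pandas as pd

# Hypothetical rows that mimic repeated headlines in the combined frame.
demo = pd.DataFrame({
    'stock': ['TSLA', 'TSLA', 'NIO', 'LI'],
    'headline': ['Plant lowers target', 'Plant lowers target',
                 'VinFast falls', 'VinFast falls'],
    'source': ['(Reuters)', '(Reuters)', '(WSJ)', '(WSJ)'],
})

# Keep one copy of each headline per ticker.
per_stock = demo.drop_duplicates(subset=['stock', 'headline'])
print(per_stock['stock'].value_counts())

# Keep one copy of each headline regardless of ticker.
unique_headlines = demo.drop_duplicates(subset=['headline'])
print(len(unique_headlines))
```

Which variant fits depends on the later analysis: per-ticker deduplication keeps a shared story attached to every company it was listed under.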


In [21]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 600 entries, 0 to 599
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   stock     600 non-null    object
 1   headline  600 non-null    object
 2   source    600 non-null    object
 3   date      600 non-null    object
 4   time      600 non-null    object
dtypes: object(5)
memory usage: 23.6+ KB


In [22]:
df['stock'].unique()

array(['TSLA', 'NIO', 'LI', 'RIVN', 'XPEV', 'LCID'], dtype=object)

In [23]:
df['date'].unique()

array(['Aug-23-23', 'Aug-22-23', 'Aug-21-23', 'Aug-20-23', 'Aug-19-23',
       'Aug-18-23', 'Aug-17-23', 'Aug-16-23', 'Aug-15-23', 'Aug-14-23',
       'Aug-10-23', 'Aug-09-23', 'Aug-08-23', 'Aug-07-23', 'Aug-06-23',
       'Aug-02-23', 'Aug-01-23', 'Jul-31-23', 'Jul-28-23', 'Jul-27-23',
       'Jul-26-23', 'Jul-25-23', 'Jul-24-23', 'Jul-21-23', 'Jul-20-23',
       'Jul-19-23', 'Jul-17-23', 'Jul-14-23', 'Jul-13-23', 'Jul-12-23',
       'Jul-11-23', 'Jul-10-23', 'Jul-09-23', 'Jul-07-23', 'Jul-06-23',
       'Jul-05-23', 'Jul-03-23', 'Jul-01-23', 'Jun-30-23', 'Jun-29-23',
       'Jun-28-23', 'Jun-27-23', 'Jun-26-23', 'Jun-24-23', 'Jun-23-23',
       'Jun-22-23', 'Jun-21-23', 'Jun-20-23', 'Jun-19-23', 'Jun-18-23',
       'Jun-16-23', 'Jun-15-23', 'Jun-14-23', 'Jun-13-23', 'Jun-12-23',
       'Aug-13-23', 'Aug-12-23', 'Aug-04-23', 'Aug-03-23', 'Jul-18-23',
       'Jul-02-23', 'Aug-11-23', 'Jul-30-23', 'Jul-29-23', 'Jun-08-23',
       'Jun-06-23', 'Jun-05-23', 'Jun-02-23', 'Jun-01-23', 'May-
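
Because `date` is stored as a string, `unique()` returns the values in order of first appearance, not in chronological order. A minimal sketch of parsing the strings back into dates to obtain a sorted list and the covered range is shown below. The three sample values are taken from the outputs above, and `%b` assumes English month abbreviations.

```python
import pandas as pd

# Sample values taken from the 'date' column shown above.
dates = pd.Series(['Aug-23-23', 'Jun-12-23', 'Feb-22-23', 'Aug-23-23'])

parsed = pd.to_datetime(dates, format='%b-%d-%y')

# Unique dates in chronological order, plus the first and last day covered.
ordered = sorted(parsed.dt.strftime('%Y-%m-%d').unique())
print(ordered)
print(parsed.min().date(), parsed.max().date())
```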

In [24]:
df.to_csv('finviz_ev2.csv', index=False)

### References

1) https://companiesmarketcap.com/rivian/marketcap/
2) https://www.investors.com/news/best-ev-stocks-buy-now-electric-cars/
3) https://www.forbes.com/advisor/investing/best-ev-stocks/