## Program 1
CNN Money’s Market Movers website (https://money.cnn.com/data/hotstocks/) tracks the most active stocks in real time. At any given moment it lists the most active stocks, the top gainers, and the top losers. This program collects those three lists from the website. It then uses the ticker symbols, company names, and categories to build a csv file with data about each stock from Yahoo Finance; for example, https://finance.yahoo.com/quote/AMD?p=AMD&.tsrc=fin-srch-v1 gives the quote for ticker symbol AMD. The csv file is named after the time it was created, in the form YYYYMMDD-HHMMSS.csv (for example, <b>20191201-212519.csv</b>). The data collected from the Yahoo Finance site includes:<br>
<blockquote>OPEN price<br>
PREV CLOSE price<br>
VOLUME<br>
MARKET CAP</blockquote>
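
As a rough illustration of the file naming scheme and the quote URL described above, the sketch below builds both. The helper names `csv_filename` and `quote_url` are hypothetical and not part of the program; the URL pattern is simply copied from the AMD example.

```python
import time

def csv_filename(when=None):
    """Return a name such as 20191201-212519.csv for a struct_time (default: now)."""
    if when is None:
        when = time.localtime()
    return time.strftime("%Y%m%d-%H%M%S", when) + ".csv"

def quote_url(symbol):
    """Build the Yahoo Finance quote URL for a ticker symbol, following the AMD example."""
    return "https://finance.yahoo.com/quote/" + symbol + "?p=" + symbol + "&.tsrc=fin-srch-v1"

print(quote_url("AMD"))
# A fixed timestamp is used here so the result is reproducible.
print(csv_filename(time.strptime("2019-12-01 21:25:19", "%Y-%m-%d %H:%M:%S")))
```

Passing an explicit timestamp is only a convenience for trying the function out; the program itself names the file after the current time.
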
This program also lists the names of the companies in the order and categories listed on the website https://money.cnn.com/data/hotstocks/ and prompts the user to choose a company. Once the user chooses a company, the program displays its corresponding data (Open, Prev Close, Volume, and Market Cap).<br>
<br>
If the web page or server cannot be reached (including when there is no internet connection), the program will display an error message and end.
<br>
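
One way to picture the "display an error message and end" behaviour is the minimal sketch below. It is not the program's own code: it uses the standard library's `urllib` instead of `requests` so that it runs without extra packages, and `fetch_page` and the stand-in opener `unreachable` are hypothetical names introduced only for this illustration.

```python
from urllib.error import URLError
from urllib.request import urlopen

def fetch_page(url, opener=urlopen):
    """Return the page text, or None if the page or server cannot be reached."""
    try:
        with opener(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError as err:
        # URLError and HTTPError are subclasses of OSError, so this also covers
        # DNS failures, refused connections, timeouts, and HTTP error statuses.
        print("The page or server cannot be found:", err)
        return None

def unreachable(url, timeout=10):
    """Stand-in opener that simulates a missing internet connection."""
    raise URLError("no internet connection")

if fetch_page("https://money.cnn.com/data/hotstocks/", opener=unreachable) is None:
    print("Goodbye.")
```

With `requests`, the corresponding approach would be to catch `requests.exceptions.RequestException` around the `requests.get` call.
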
Sample run (user input in bold):<br>
<blockquote>This is a program to scrape data from the https://money.cnn.com/data/hotstocks/ for a class project.
<br>
Which stock are you interested in:<br>
<br>
Most Actives:<br>
AMD Advanced Micro Devices Inc<br>
GE General Electric Co<br>
BAC Bank of America Corp<br>
WBA Walgreens Boots Alliance Inc<br>
AAPL Apple Inc<br>
F Ford Motor Co<br>
FCX Freeport-McMoRan Inc<br>
CSCO Cisco Systems Inc<br>
OXY Occidental Petroleum Corp<br>
MU Micron Technology Inc<br>
<br>
Gainers:<br>
WBA Walgreens Boots Alliance Inc<br>
MKTX Marketaxess Holdings Inc<br>
NVR NVR Inc<br>
ARNC Arconic Inc<br>
GPS Gap Inc<br>
EQIX Equinix Inc<br>
ULTA Ulta Beauty Inc<br>
TTWO Take-Two Interactive Software Inc<br>
M Macy's Inc<br>
NWSA News Corp<br>
<br>
Losers:<br>
FCX Freeport-McMoRan Inc<br>
WYNN Wynn Resorts Ltd<br>
COTY Coty Inc<br>
CNP CenterPoint Energy Inc<br>
ABC AmerisourceBergen Corp<br>
MRO Marathon Oil Corp<br>
ATVI Activision Blizzard Inc<br>
COG Cabot Oil & Gas Corp<br>
XRAY Dentsply Sirona Inc<br>
<br>
User inputs: <b>COTY</b><br>
<br>
The data for COTY Coty Inc is the following:<br>
<br>
COTY Coty Inc<br>
OPEN: 12.78<br>
PREV CLOSE: 12.84<br>
VOLUME: 2,000,995<br>
MARKET CAP: 9.580B<br>
<br></blockquote>
If the user enters something other than the listed company symbols, the program will display a message about the user's error and end.<br>
<br>
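
The lookup behind this rule might be sketched as follows, assuming each stock is stored as a dictionary with the keys `'Company Code'` and `'Company Name'`. The function `find_stock` and the small `categories` dictionary are hypothetical and only serve the illustration.

```python
def find_stock(symbol, categories):
    """Return (category, stock) for the first listing of symbol, or None if it is not listed."""
    for category, stocks in categories.items():
        for stock in stocks:
            if stock["Company Code"] == symbol:
                return category, stock
    return None

# Hypothetical records; the names are taken from the sample run above.
categories = {
    "Most Actives": [{"Company Code": "AMD", "Company Name": "Advanced Micro Devices Inc"}],
    "Losers": [{"Company Code": "COTY", "Company Name": "Coty Inc"}],
}

for choice in ("COTY", "XYZ"):
    match = find_stock(choice, categories)
    if match is None:
        print("You did not enter a stock option among the selection given. Goodbye.")
    else:
        category, stock = match
        print(category, stock["Company Code"], stock["Company Name"])
```

Returning on the first match also means a stock that appears in more than one category is reported only once.
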
The csv should look something like this for one entry:<br>
<blockquote>Losers,COTY,Coty Inc,12.78,12.84,2000995,9.580B</blockquote>
If the program cannot access the location to save a file with the stock data, the program will display an error message and end.
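
As a hedged sketch of the csv output and the file error handling described above, the example below uses the standard `csv` module rather than manual string joining. The helpers `write_rows` and `save` are hypothetical names, and the row is the sample entry shown earlier.

```python
import csv
import io

def write_rows(handle, rows):
    """Write one csv line per stock row to an open text handle."""
    csv.writer(handle).writerows(rows)

def save(path, rows):
    """Write rows to path; report failure instead of crashing if the location is not writable."""
    try:
        with open(path, "w", newline="") as handle:
            write_rows(handle, rows)
    except OSError as err:
        print("The file cannot be created and written to an accessible location:", err)
        return False
    return True

rows = [["Losers", "COTY", "Coty Inc", "12.78", "12.84", "2000995", "9.580B"]]

# Writing to an in-memory buffer shows the resulting line without touching the disk.
buffer = io.StringIO()
write_rows(buffer, rows)
print(buffer.getvalue(), end="")
```

A side effect of this design is that `csv.writer` quotes any field containing a comma, so a value such as a comma-separated volume would still be read back as a single field.
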

In [3]:
import re
import time

import requests

try:
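    #Scrape the whole hotstocks page from money.cnn.com
    myurl='https://money.cnn.com/data/hotstocks/'
    try:
        hotstocks_handle=requests.get(myurl)
        hotstocks_handle.raise_for_status()#treat HTTP error statuses (e.g. 404) as failures too
    except requests.exceptions.RequestException:
        print('The Money.CNN page or server cannot be found.')
        raise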
    hotstocks_text=hotstocks_handle.text
    #Extract the company symbols and names of the companies that are among the listed most actives, gainers, and losers
    regex = r'wsod_symbol">([^<]+)[\s\S]{2,30}title[^>]+>([^<]+)'

    #Extract sections and boundaries of most actives, gainers, and losers from the extracted page
    active = r'Most Actives[\s\S]*?Gainers'
    gainers = r'Gainers[\s\S]*?Losers'
    losers = r'Losers[\s\S]*'
    try:
        activeString = re.findall(active, hotstocks_text)
        gainersString = re.findall(gainers, hotstocks_text)
        losersString = re.findall(losers, hotstocks_text)
        #Determine and extract lists of companies that belong to the categories most actives, gainers, and losers
        activeArray = re.findall(regex, activeString[0])
        gainerArray = re.findall(regex, gainersString[0])
        losersArray = re.findall(regex, losersString[0])
    except IndexError:
        print('Tags cannot be found. You might want to check whether the Money.CNN webpage exists or the link to Money.CNN is correct.')
        raise
    #Scrape stock data from finance.yahoo.com for the companies listed in activeArray, gainerArray, and losersArray
    a=[]
    b=[]
    c=[]

    #The same pattern applies to every Yahoo quote page, so define it once
    regexStock=r'(Previous Close)[\s\S]*?data-reactid[^>]*?>([\.\d]+)[\s\S]*?(Open)[\s\S]*?data-reactid[^>]*?>([\.\d]+)[\s\S]*?(Volume)[\s\S]*?data-reactid[^>]*?>([\.\,\d]+)[\s\S]*?(Market Cap)[\s\S]*?data-reactid[^>]*?>([\.\dA-Za-z]+)'
    #Fetch the quote page for each company and store its data in the list for its category
    for companies, results in ((activeArray, a), (gainerArray, b), (losersArray, c)):
        for i in companies:
            stockurl='https://finance.yahoo.com/quote/'+i[0]+'?p='+i[0]+'&.tsrc=fin-srch-v1'
            try:
                stock=requests.get(stockurl)
                stock.raise_for_status()#treat HTTP error statuses as failures too
            except requests.exceptions.RequestException:
                print('The Yahoo page or server cannot be found.')
                raise
            stock_text=stock.text
            try:
                stringStock = re.findall(regexStock, stock_text)
                results.append({'Company Code':i[0], 'Company Name':i[1], 'Open':stringStock[0][3], 'Previous Close':stringStock[0][1], 'Volume':stringStock[0][5], 'Market Cap':stringStock[0][7]})
            except IndexError:
                print('Tags cannot be found. You might want to check whether the Yahoo webpage exists or the link to Yahoo is correct.')
                raise

    #Prep the extracted companies among the most actives, gainers, and losers into lists prior to writing to file
    AllCats=[]
    for category, stocks in (('Most Actives', a), ('Gainers', b), ('Losers', c)):
        rows=[]
        for i in stocks:
            #remove commas (e.g. from volume) and put the category first
            rows.append([category]+[re.sub(r'[,]','',value) for value in i.values()])
        AllCats.append(rows)

    #Write stock data to file
    timestr = time.strftime("%Y%m%d-%H%M%S")+".csv"#name the file with time and date
    try:
        with open(timestr,'w') as filehandle:
            for i in AllCats:
                for z in i:
                    #one line per stock, with the fields separated by commas
                    filehandle.write(",".join(str(y) for y in z)+"\n")
    except OSError:
        print('The file cannot be created and written to an accessible location. Ensure that the location to write a file is accessible.')
        raise
    #Display the categories most actives, gainers, and losers for the user to choose a company symbol from
    d=[a,b,c]
    print('This is a program to scrape data from the https://money.cnn.com/data/hotstocks/ for a class project.\n')
    print('Which stock are you interested in:')
    print('')
    print('Most Actives:')
    for i in d[0]:
        print(i['Company Code'],i['Company Name'])
    print('')
    print('Gainers:')
    for i in d[1]:
        print(i['Company Code'],i['Company Name'])
    print('')
    print('Losers:')
    for i in d[2]:
        print(i['Company Code'],i['Company Name'])
    print('')
    #The user chooses a company symbol to see specific stock data.  
    user=str(input('User inputs: '))
    print('')
    hits=0
    try:
        for i in d:
            for z in i:
                if user == z['Company Code']:
                    hits=hits+1#found a match
                    #Display specific stock data
                    print('The data for',z['Company Code'],z['Company Name'],'is the following:')
                    print('')
                    print(z['Company Code'],z['Company Name'])
                    print('OPEN: ',z['Open'])
                    print('PREV CLOSE: ',z['Previous Close'])
                    print('VOLUME: ',z['Volume'])
                    print('MARKET CAP: ',z['Market Cap'])
                    break
            #Prevent stock data from being shown twice or more if a stock happens to be among more than one category
            if hits == 1:
                break
        #Catch any input from the user that does not match that from the company among the categories extracted
        if hits == 0:
            raise KeyError

    except KeyError:
        print('You did not enter a stock option among the selection given. Goodbye.')
except Exception:
    print('Goodbye.')

This is a program to scrape data from the https://money.cnn.com/data/hotstocks/ for a class project.

Which stock are you interested in:

Most Actives:
GE General Electric Co
BAC Bank of America Corp
T AT&T Inc
AMD Advanced Micro Devices Inc
F Ford Motor Co
MSFT Microsoft Corp
AAPL Apple Inc
BMY Bristol-Myers Squibb Co
FCX Freeport-McMoRan Inc
HPQ HP Inc

Gainers:
NRG NRG Energy Inc
HPQ HP Inc
CF CF Industries Holdings Inc
DXC DXC Technology Co
CME CME Group Inc
CRM Salesforce.Com Inc
DRI Darden Restaurants Inc
ICE Intercontinental Exchange Inc
VRTX Vertex Pharmaceuticals Inc
ATVI Activision Blizzard Inc

Losers:
APA Apache Corp
FTI TechnipFMC PLC
DVN Devon Energy Corp
KSS Kohls Corp
HP Helmerich and Payne Inc
NBL Noble Energy Inc
URI United Rentals Inc
HBI HanesBrands Inc
LYB LyondellBasell Industries NV
EOG EOG Resources Inc

User inputs: APA

The data for APA Apache Corp is the following:

APA Apache Corp
OPEN:  22.78
PREV CLOSE:  23.22
VOLUME:  3,198,120
MARKET CAP:  8.378B
