### IPO price scraper - see IPO_price_scraper_v2.ipynb for updated version

In this notebook, I scrape stock data and company profiles from Yahoo Finance for the IPO companies collected in the previous step from the NASDAQ site (see IPO_scraper.ipynb).

In [561]:
import re
from difflib import SequenceMatcher
import pandas_datareader.data as pdr #Yahoo Finance ticker
import pandas as pd
import numpy as np

from datetime import datetime

from bs4 import BeautifulSoup
import requests
from time import sleep

In [636]:
df = pd.read_csv('ipo_clean_2010_2018.csv', parse_dates=['Date Priced', 'year'])
df.shape

(1600, 13)

In [639]:
df.dtypes

Company Name            object
Symbol                  object
Market                  object
Price                  float64
Shares                 float64
Offer Amount           float64
Date Priced     datetime64[ns]
employees              float64
address                 object
US_state                object
descriptions            object
link_nasdaq             object
year            datetime64[ns]
dtype: object

## Checking for duplicated tickers and company names:

In [590]:
print("number of duplicated companies in the dataset:", df['Company Name'].duplicated().sum())

number of duplicated companies in the dataset: 3


In [591]:
df[df['Company Name'].duplicated(keep=False)]

Unnamed: 0,Company Name,Symbol,Market,Price,Shares,Offer Amount,Date Priced,employees,address,US_state,descriptions,link_nasdaq,year
85,AMBOW EDUCATION HOLDING LTD.,AMBO,New York Stock Exchange,10.0,10677207,106772070,2010-08-05,10361.0,"12TH FLOOR, NO. 1 FINANCIAL STREETCHANG AN CEN...",,We are a leading national provider of educatio...,https://www.nasdaq.com/markets/ipos/company/am...,2010
205,SANDRIDGE ENERGY INC,SDT,New York Stock Exchange,21.0,15000000,315000000,2011-04-07,2192.0,"123 ROBERT S. KERR AVENUEOKLAHOMA CITY, OK 731...",OK,SandRidge Mississippian Trust I is a Delaware ...,https://www.nasdaq.com/markets/ipos/company/sa...,2011
254,SANDRIDGE ENERGY INC,PER,New York Stock Exchange,18.0,30000000,540000000,2011-08-11,2192.0,"123 ROBERT S. KERR AVENUEOKLAHOMA CITY, OK 731...",OK,SandRidge Permian Trust is a Delaware statutor...,https://www.nasdaq.com/markets/ipos/company/sa...,2011
341,SANDRIDGE ENERGY INC,SDR,New York Stock Exchange,21.0,26000000,546000000,2012-04-18,,"123 ROBERT S. KERR AVENUEOKLAHOMA CITY, OK 731...",OK,SandRidge Mississippian Trust II is a Delaware...,https://www.nasdaq.com/markets/ipos/company/sa...,2012
1503,AMBOW EDUCATION HOLDING LTD.,AMBO,NYSE MKT,4.25,1800000,7650000,2018-06-01,2657.0,"12TH FLOOR, NO. 1 FINANCIAL STREETCHANG AN CEN...",,"Our mission is to provide Better Schools, Bett...",https://www.nasdaq.com/markets/ipos/company/am...,2018


Commentary: Ambow Education Holding Ltd. was listed twice. Its first listing was liquidated after a scandal in 2014 [1].
SandRidge Energy Inc: the three entries are different trusts that belong to one company, so technically they are distinct listings.

1. https://ir.theice.com/press/press-releases/nyse-regulation/2014/nyse-to-immediately-suspend-trading-in-american-depository

In [592]:
print("number of duplicated tickers in the dataset:", df['Symbol'].duplicated().sum())

number of duplicated tickers in the dataset: 7


In [593]:
df.iloc[:, :7][df['Symbol'].duplicated(keep=False)]

Unnamed: 0,Company Name,Symbol,Market,Price,Shares,Offer Amount,Date Priced
60,"HIGHER ONE HOLDINGS, INC.",ONE,New York Stock Exchange,12.0,9000000,108000000,2010-06-17
85,AMBOW EDUCATION HOLDING LTD.,AMBO,New York Stock Exchange,10.0,10677207,106772070,2010-08-05
97,SEACUBE CONTAINER LEASING LTD.,BOX,New York Stock Exchange,10.0,9500000,95000000,2010-10-28
293,BAZAARVOICE INC,BV,NASDAQ,12.0,9484296,113811552,2012-02-24
331,ACQUITY GROUP LTD,AQ,American Stock Exchange,6.0,5555556,33333336,2012-04-27
356,"TIAA FSB HOLDINGS, INC.",EVER,New York Stock Exchange,10.0,19220000,192200000,2012-05-03
715,QUOTIENT TECHNOLOGY INC.,COUP,New York Stock Exchange,16.0,10500000,168000000,2014-03-07
948,BOX INC,BOX,New York Stock Exchange,14.0,12500000,175000000,2015-01-23
1207,COUPA SOFTWARE INC,COUP,NASDAQ Global Select,18.0,7400000,133200000,2016-10-06
1374,AQUANTIA CORP,AQ,NYSE,9.0,6818000,61362000,2017-11-03


Commentary: Some companies went defunct and their tickers were reassigned to newer IPOs (like EVER), while others went public twice (like Ambow). Either way, because I scrape stock data from Yahoo Finance by ticker, and Yahoo Finance only has data for the most recent holder of a ticker, I keep only the company with the later IPO for each duplicated ticker. I mark the earlier ones as defunct.

In [594]:
defunct_oldticker = df['Symbol'][df['Symbol'].duplicated(keep='last')]

In [640]:
df = df[~df['Symbol'].duplicated(keep='last')]

In [641]:
df.shape

(1593, 13)
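
The deduplication above can be illustrated on toy data. This is a minimal sketch, not the notebook's exact cell: the three rows are copied from the tables above, and the explicit sort by `Date Priced` is my own addition, on the assumption that the CSV rows are not guaranteed to be in chronological order (`keep='last'` only means "most recent IPO" if they are).

```python
import pandas as pd

# Toy rows modelled on the BOX case shown above.
toy = pd.DataFrame({
    "Company Name": ["SEACUBE CONTAINER LEASING LTD.", "BOX INC", "GENERAC HOLDINGS INC."],
    "Symbol": ["BOX", "BOX", "GNRC"],
    "Date Priced": pd.to_datetime(["2010-10-28", "2015-01-23", "2010-02-11"]),
})

# Sort by pricing date so that "last" really means "most recent IPO".
toy = toy.sort_values("Date Priced")

# duplicated(keep='last') flags every occurrence of a ticker except the final one.
is_older_duplicate = toy["Symbol"].duplicated(keep="last")

older = toy[is_older_duplicate]      # earlier holders of a reused ticker
deduped = toy[~is_older_duplicate]   # rows kept for the Yahoo Finance lookup
print(deduped)
```

Only the 2015 BOX listing survives, which is the one whose price history Yahoo Finance would return for that ticker.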

## Checking suspicious misalignment between company name and its ticker:

After scraping the IPO companies, I noticed some mismatches between company names and their tickers. There are several possible causes: the company is now defunct and its ticker belongs to another company, the company is still active but changed its ticker, or the ticker was entered incorrectly on the Nasdaq website.

So in this step, I check whether every letter of the symbol appears in the company name. If not, I add the company to a dict of suspect companies. In the next step, I look these suspicious tickers up on Yahoo Finance, retrieve the company names listed there, and compare them to the company names on the Nasdaq list.

In [552]:
companies_tocheck = {}
for name, symbol in zip(df['Company Name'], df['Symbol']):
    #flag the company if any letter of its ticker is missing from its name
    if any(letter not in name for letter in symbol):
        companies_tocheck[name] = symbol

In [553]:
print(f"{len(companies_tocheck)} companies have suspicious tickers. Will be checked")

284 companies have suspicious tickers. Will be checked
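
As a standalone illustration of the letter check, here is a small sketch. The helper names `missing_letters` and `suspicious_tickers` are hypothetical, and unlike the cell above this version uppercases both strings and skips non-letter characters in the ticker. Both are assumptions on my part and would change the count if applied to the real data.

```python
def missing_letters(symbol, company_name):
    """Return the ticker letters that do not occur in the company name."""
    name = company_name.upper()
    return [ch for ch in symbol.upper() if ch.isalpha() and ch not in name]


def suspicious_tickers(pairs):
    """Map company name -> ticker for pairs with at least one missing letter."""
    return {name: symbol for name, symbol in pairs if missing_letters(symbol, name)}


pairs = [
    ("GENERAC HOLDINGS INC.", "GNRC"),   # every letter appears in the name
    ("IFM INVESTMENTS LTD", "CTC"),      # 'C' does not appear in the name
]
print(suspicious_tickers(pairs))
```

The check is deliberately loose: it ignores letter order and repetition, so it only produces candidates for the Yahoo Finance comparison that follows and is not a verdict on its own.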


In [554]:
url_1 = 'https://finance.yahoo.com/quote/%s/'
url_2 = 'profile?p=%s'
unmatched = {}
unmatched_tickers = []

for i, ticker in enumerate(list(companies_tocheck.values())):
    print(f"scraping {ticker} - {i+1}/{len(companies_tocheck)}")
    result = requests.get(url_1 % ticker + url_2 % ticker)
    content = result.content

    soup = BeautifulSoup(content, "html.parser")
    data = soup.find_all("h1", {"class":"D(ib) Fz(16px) Lh(18px)"})
    if data:
        txt = str([x.get_text() for x in data]).replace(ticker, "").lower()
        regex = re.compile(r"\w+")
        txt = " ".join(regex.findall(txt)).replace("corporation", "corp").replace("limited", "ltd")
        name = df['Company Name'][df['Symbol'] == ticker].values[0].lower().replace(",", "").replace(".", "").replace("corporation", "corp").replace("limited", "ltd")
        #checking with SequenceMatcher if two strings match for at least 80%
        if SequenceMatcher(None, txt, name).ratio() < 0.8:
            unmatched.update({name: txt})
            unmatched_tickers.append(ticker)

scraping CTC - 1/284
scraping STNG - 2/284
scraping FIBK - 3/284
scraping DHRM - 4/284
scraping AH - 5/284
scraping PLOW - 6/284
scraping SANWU - 7/284
scraping HSFT - 8/284
scraping VRNGU - 9/284
scraping CHKM - 10/284
scraping OINK - 11/284
scraping OXF - 12/284
scraping ECT - 13/284
scraping AMAP - 14/284
scraping GMAN - 15/284
scraping DMED - 16/284
scraping TSRX - 17/284
scraping SFUN - 18/284
scraping XRS - 19/284
scraping CCG - 20/284
scraping XNY - 21/284
scraping KFFG - 22/284
scraping VTUS - 23/284
scraping LAS - 24/284
scraping FXCM - 25/284
scraping DMD - 26/284
scraping IFT - 27/284
scraping MEDH - 28/284
scraping PCRX - 29/284
scraping TRNX - 30/284
scraping QIHU - 31/284
scraping MX - 32/284
scraping MKTG - 33/284
scraping TEU - 34/284
scraping GMLP - 35/284
scraping UAN - 36/284
scraping PER - 37/284
scraping TZYM - 38/284
scraping SZYM - 39/284
scraping SAVE - 40/284
scraping FENG - 41/284
scraping NQ - 42/284
scraping WIFI - 43/284
scraping FBNK - 44/284
scraping GSJK

In [615]:
print(f'there are {len(unmatched_tickers)} unmatched tickers. Will be removed')
unmatched

there are 24 unmatched tickers. Will be removed


{'energy corp of america': 'eca marcellus trust i',
 'g-estate liquidation stores inc': 'goldman sachs motif manufacturing revolution etf',
 'assembly biosciences inc': 'virgin trains usa llc',
 'mmodal inc': 'medx holdings inc',
 'wright medical group nv': 'taronis technologies inc',
 'sandridge energy inc': 'sandridge permian trust',
 'vereit inc': 'american realty capital propert',
 'aptiv plc': 'delphi technologies plc',
 'worldpay inc': 'vantiv inc class a',
 'hunt companies finance trust inc': 'five oaks investment corp',
 'pnmac holdings inc': 'pennymac financial services inc',
 'retailmenot inc': 'retailmenot inc series 1',
 'onemain holdings inc': 'leaf group ltd',
 'zenith energy logistics partners lp': 'arc logistic partners lp common',
 'eqt re llc': 'rice energy inc',
 'aravive inc': 'versartis inc',
 'plx pharma inc': 'dipexium pharmaceuticals inc',
 'immunic inc': 'vital therapies inc',
 'miragen therapeutics inc': 'signal genetics inc',
 'leap therapeutics inc': 'macroc

In [642]:
df = df[~df['Symbol'].isin(unmatched_tickers)]

In [643]:
df.shape

(1569, 13)
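
The name comparison used above can be isolated into two small helpers. This is a sketch under assumptions: `normalize_name` and `names_match` are hypothetical names, both names go through the same normalization (the cell above treats the Nasdaq and Yahoo strings slightly differently), and abbreviations are applied per whole word instead of by substring replacement.

```python
import re
from difflib import SequenceMatcher

# Abbreviations applied to both names before comparison (same pairs as above).
ABBREVIATIONS = {"corporation": "corp", "limited": "ltd"}


def normalize_name(name):
    """Lowercase, keep only word characters, and abbreviate common suffixes."""
    words = re.findall(r"\w+", name.lower())
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def names_match(name_a, name_b, threshold=0.8):
    """True when the normalized names are at least `threshold` similar."""
    ratio = SequenceMatcher(None, normalize_name(name_a), normalize_name(name_b)).ratio()
    return ratio >= threshold


# Identical after normalization.
print(names_match("Generac Holdings Inc.", "GENERAC HOLDINGS INC"))
# Unrelated names, taken from the unmatched list above.
print(names_match("Assembly Biosciences Inc", "Virgin Trains USA LLC"))
```

A fixed 0.8 threshold is a blunt instrument: renamed companies such as Aptiv/Delphi fall below it even though the ticker history is continuous, so the unmatched list is worth reviewing by hand before dropping rows.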

## 1.1. Scraping stock data

In [559]:
df_yahoo = pd.DataFrame()
notscraped = []
for i, ticker in enumerate(df.Symbol):
    try:
        print(f"scraping {ticker} - {i+1}/{len(df.Symbol)}")
        st = pdr.DataReader(ticker, "yahoo")
        #rows are trading days, so .iloc[7] and .iloc[30] are 7 and 30 trading days after the first row
        firstday_open = st['Open'].iloc[0]
        firstday_spread = st['High'].iloc[0] - st['Low'].iloc[0] #diff between high and low price on the first day of trading
        firstday_volume = st['Volume'].iloc[0]
        firstday_adjclose = st['Adj Close'].iloc[0]
        inweek_open = st['Open'].iloc[7]
        inweek_spread = st['High'].iloc[7] - st['Low'].iloc[7]
        inweek_volume = st['Volume'].iloc[7]
        inweek_adjclose = st['Adj Close'].iloc[7]
        inmonth_open = st['Open'].iloc[30]
        inmonth_spread = st['High'].iloc[30] - st['Low'].iloc[30]
        inmonth_volume = st['Volume'].iloc[30]
        inmonth_adjclose = st['Adj Close'].iloc[30]
        ticker_df = pd.DataFrame({ticker: {'firstday_open': firstday_open, 'firstday_spread': firstday_spread, 
                                  'firstday_volume': firstday_volume, 'firstday_adjclose': firstday_adjclose,
                                  'inweek_open': inweek_open, 'inweek_spread': inweek_spread, 
                                  'inweek_volume': inweek_volume, 'inweek_adjclose': inweek_adjclose, 
                                  'inmonth_open': inmonth_open, 'inmonth_spread': inmonth_spread, 
                                  'inmonth_volume': inmonth_volume, 'inmonth_adjclose': inmonth_adjclose}}).T
        df_yahoo = pd.concat([df_yahoo, ticker_df])
    except Exception:
        #no data on Yahoo Finance, or too few rows of price history
        notscraped.append(ticker)
        continue

scraping CTC - 1/1569
scraping AMCF - 2/1569
scraping CHC - 3/1569
scraping CHSP - 4/1569
scraping CLU - 5/1569
scraping SYA - 6/1569
scraping GNRC - 7/1569
scraping GRM - 8/1569
scraping QNST - 9/1569
scraping TRNO - 10/1569
scraping PDM - 11/1569
scraping IRWD - 12/1569
scraping MERU - 13/1569
scraping SSNC - 14/1569
scraping STNG - 15/1569
scraping HTHT - 16/1569
scraping CRMD - 17/1569
scraping FIBK - 18/1569
scraping CALX - 19/1569
scraping MXL - 20/1569
scraping FNGN - 21/1569
scraping CRU - 22/1569
scraping AVEO - 23/1569
scraping ST - 24/1569
scraping BALT - 25/1569
scraping ANTH - 26/1569
scraping PNG - 27/1569
scraping CNVO - 28/1569
scraping AOSL - 29/1569
scraping EXL - 30/1569
scraping GGS - 31/1569
scraping MITL - 32/1569
scraping CDXS - 33/1569
scraping ALIM - 34/1569
scraping DHRM - 35/1569
scraping DVOX - 36/1569
scraping SPSC - 37/1569
scraping RLOC - 38/1569
scraping AH - 39/1569
scraping CLDT - 40/1569
scraping MUSA - 41/1569
scraping TNGN - 42/1569
scraping PRI - 4

scraping TLYS - 339/1569
scraping PDH - 340/1569
scraping ROYT - 341/1569
scraping CG - 342/1569
scraping SUPN - 343/1569
scraping NOW - 344/1569
scraping TSRO - 345/1569
scraping EXA - 346/1569
scraping EQM - 347/1569
scraping CNCO - 348/1569
scraping AMRE - 349/1569
scraping DFRG - 350/1569
scraping EOPN - 351/1569
scraping HPTX - 352/1569
scraping NTI - 353/1569
scraping NGVC - 354/1569
scraping CHUY - 355/1569
scraping PANW - 356/1569
scraping KYAK - 357/1569
scraping DRTX - 358/1569
scraping FIVE - 359/1569
scraping FSBW - 360/1569
scraping HCLP - 361/1569
scraping MANU - 362/1569
scraping PFMT - 363/1569
scraping PSMI - 364/1569
scraping BLMN - 365/1569
scraping GMED - 366/1569
scraping ELOQ - 367/1569
scraping QLYS - 368/1569
scraping SMLP - 369/1569
scraping CBF - 370/1569
scraping NBHC - 371/1569
scraping SRC - 372/1569
scraping SUSP - 373/1569
scraping TRLA - 374/1569
scraping MPLX - 375/1569
scraping WWAV - 376/1569
scraping MEILU - 377/1569
scraping LGP - 378/1569
scraping 

scraping CBSO - 670/1569
scraping EVDY - 671/1569
scraping HIVE - 672/1569
scraping TWOU - 673/1569
scraping WATT - 674/1569
scraping SQBK - 675/1569
scraping TNET - 676/1569
scraping AGTC - 677/1569
scraping KING - 678/1569
scraping NORD - 679/1569
scraping AMBR - 680/1569
scraping ATEN - 681/1569
scraping BRDR - 682/1569
scraping RTGN - 683/1569
scraping QTWO - 684/1569
scraping AKBA - 685/1569
scraping MDWD - 686/1569
scraping PCTY - 687/1569
scraping RXDX - 688/1569
scraping CSLT - 689/1569
scraping GLMD - 690/1569
scraping AKAO - 691/1569
scraping AQXP - 692/1569
scraping REPH - 693/1569
scraping EVAR - 694/1569
scraping VGGL - 695/1569
scraping QTNTU - 696/1569
scraping SABR - 697/1569
scraping SPWH - 698/1569
scraping LEJU - 699/1569
scraping MC - 700/1569
scraping TRIV - 701/1569
scraping PAYC - 702/1569
scraping CIO - 703/1569
scraping WB - 704/1569
scraping ZOES - 705/1569
scraping PAHC - 706/1569
scraping FPI - 707/1569
scraping ENBL - 708/1569
scraping CERU - 709/1569
scrap

scraping IVTY - 1001/1569
scraping WING - 1002/1569
scraping PUB - 1003/1569
scraping AXON - 1004/1569
scraping BITI - 1005/1569
scraping EVH - 1006/1569
scraping DTEA - 1007/1569
scraping GI - 1008/1569
scraping PTXP - 1009/1569
scraping AQMS - 1010/1569
scraping GLBL - 1011/1569
scraping XELB - 1012/1569
scraping VTVT - 1013/1569
scraping NK - 1014/1569
scraping NEOS - 1015/1569
scraping LOB - 1016/1569
scraping BUFF - 1017/1569
scraping MCFT - 1018/1569
scraping OOMA - 1019/1569
scraping RPD - 1020/1569
scraping OLLI - 1021/1569
scraping JP - 1022/1569
scraping CHMA - 1023/1569
scraping HLG - 1024/1569
scraping NTRA - 1025/1569
scraping CNXC - 1026/1569
scraping CFMS - 1027/1569
scraping TDOC - 1028/1569
scraping UFAB - 1029/1569
scraping BNTC - 1030/1569
scraping HLI - 1031/1569
scraping GBT - 1032/1569
scraping AIMT - 1033/1569
scraping PLNT - 1034/1569
scraping RUN - 1035/1569
scraping ZYNE - 1036/1569
scraping BETR - 1037/1569
scraping NTEC - 1038/1569
scraping NBRV - 1039/1569


scraping HAIR - 1319/1569
scraping CARG - 1320/1569
scraping SWCH - 1321/1569
scraping RYTM - 1322/1569
scraping BOXL - 1323/1569
scraping RETO - 1324/1569
scraping BXG - 1325/1569
scraping AMRH - 1326/1569
scraping SAIL - 1327/1569
scraping SBT - 1328/1569
scraping SCPH - 1329/1569
scraping SFIX - 1330/1569
scraping JT - 1331/1569
scraping ASNS - 1332/1569
scraping SEND - 1333/1569
scraping PPDF - 1334/1569
scraping BAND - 1335/1569
scraping ERYP - 1336/1569
scraping APLS - 1337/1569
scraping SOGO - 1338/1569
scraping MCB - 1339/1569
scraping FEDU - 1340/1569
scraping IFRX - 1341/1569
scraping CBTX - 1342/1569
scraping HX - 1343/1569
scraping AQ - 1344/1569
scraping ACMR - 1345/1569
scraping SPRO - 1346/1569
scraping ALNA - 1347/1569
scraping AQUA - 1348/1569
scraping FNKO - 1349/1569
scraping ALTR - 1350/1569
scraping LOMA - 1351/1569
scraping ICLK - 1352/1569
scraping LX - 1353/1569
scraping DOGZ - 1354/1569
scraping CASA - 1355/1569
scraping NMRK - 1356/1569
scraping GIG'U - 1357/1
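
The feature extraction in the loop above can be sketched without any network access. The helpers `snapshot` and `ipo_features` are hypothetical names, and the price history is synthetic. One deliberate difference from the cell above: when the history is too short, this sketch fills the missing snapshot with NaN instead of sending the whole ticker to `notscraped`. That is an alternative design, not what the notebook does.

```python
import numpy as np
import pandas as pd


def snapshot(prices, row, prefix):
    """Open, high-low spread, volume and adjusted close for one trading row."""
    if row >= len(prices):
        # Not enough history: return NaNs rather than raising IndexError.
        keys = ("open", "spread", "volume", "adjclose")
        return {f"{prefix}_{key}": np.nan for key in keys}
    day = prices.iloc[row]
    return {
        f"{prefix}_open": day["Open"],
        f"{prefix}_spread": day["High"] - day["Low"],
        f"{prefix}_volume": day["Volume"],
        f"{prefix}_adjclose": day["Adj Close"],
    }


def ipo_features(prices):
    """Combine the first-day, row-7 and row-30 snapshots into one dict."""
    features = {}
    for prefix, row in (("firstday", 0), ("inweek", 7), ("inmonth", 30)):
        features.update(snapshot(prices, row, prefix))
    return features


# Synthetic price history with only 10 trading rows (made-up values).
dates = pd.bdate_range("2015-01-23", periods=10)
prices = pd.DataFrame({
    "Open": np.arange(10, 20, dtype=float),
    "High": np.arange(11, 21, dtype=float),
    "Low": np.arange(9, 19, dtype=float),
    "Volume": np.arange(1000, 1010, dtype=float),
    "Adj Close": np.arange(10, 20, dtype=float) + 0.5,
}, index=dates)

features = ipo_features(prices)
print(features["firstday_spread"], features["inweek_open"], features["inmonth_open"])
```

Because the rows are trading days, row 7 is the eighth trading day and row 30 the thirty-first, which is somewhat later than one calendar week or month after the first row.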

## Checking companies not scraped in Yahoo Finance:

In this part, I take a closer look at the stocks that could not be retrieved from Yahoo Finance. There are two likely reasons: either the company is defunct (it merged, was acquired, or was otherwise delisted from the stock exchange), or the company is still active but changed its ticker. In the latter case, we want to keep it in our IPO list and redo the Yahoo Finance scraping with the new ticker.

To check whether a company is defunct, I use the Seeking Alpha website, whose symbol pages provide exactly the information we need.

In [560]:
print(f"From {len(df.Symbol)} IPO companies listed on Nasdaq, {len(notscraped)} companies were not scraped on Yahoo Finance")

From 1569 IPO companies listed on Nasdaq, 433 companies were not scraped on Yahoo Finance


In [562]:
defunct = []
newticker = []
acquired = []

url = 'https://seekingalpha.com/symbol/%s'
    
notscraped_df = pd.DataFrame()
headers = {'User-Agent' : "non-profit learning project"}

for ticker in notscraped:
    print(f'scrap - {url % ticker}')
    result = requests.get(url % ticker, headers=headers).content
    sleep(15)
    
    soup = BeautifulSoup(result, "html.parser")
    m = soup.find_all("div", {"class": "defunct_message defunct_message_etf"})
    l = soup.find_all("link", {"rel": "canonical"})
    if m:
        print(m)
        defunct.append(ticker)
    elif l:
        print(l)
        nt = l[0]['href'].rsplit('/', 1)[-1] #the ticker is the last segment of the canonical URL
        newticker.append(nt)
    else:
        acquired.append(ticker)

scrap - https://seekingalpha.com/symbol/CTC
[<div class="defunct_message defunct_message_etf">CTC is defunct.</div>]
scrap - https://seekingalpha.com/symbol/CHC
[<div class="defunct_message defunct_message_etf">CHC is defunct.</div>]
scrap - https://seekingalpha.com/symbol/CLU
[<div class="defunct_message defunct_message_etf">CLU is defunct.</div>]
scrap - https://seekingalpha.com/symbol/SYA
[<div class="defunct_message defunct_message_etf">SYA is defunct.</div>]
scrap - https://seekingalpha.com/symbol/GRM
[<div class="defunct_message defunct_message_etf">GRM is defunct.</div>]
scrap - https://seekingalpha.com/symbol/MERU
[<div class="defunct_message defunct_message_etf">MERU is defunct.</div>]
scrap - https://seekingalpha.com/symbol/FNGN
[<div class="defunct_message defunct_message_etf">FNGN is defunct.</div>]
scrap - https://seekingalpha.com/symbol/CRU
[<div class="defunct_message defunct_message_etf">CRU is defunct.</div>]
scrap - https://seekingalpha.com/symbol/BALT
[<div class="de

[<link href="https://seekingalpha.com/symbol/FBNK" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/AWAY
[<div class="defunct_message defunct_message_etf">AWAY is defunct.</div>]
scrap - https://seekingalpha.com/symbol/KIOR
[<div class="defunct_message defunct_message_etf">KIOR is defunct.</div>]
scrap - https://seekingalpha.com/symbol/VHS
[<div class="defunct_message defunct_message_etf">VHS is defunct.</div>]
scrap - https://seekingalpha.com/symbol/RATE
[<div class="defunct_message defunct_message_etf">RATE is defunct.</div>]
scrap - https://seekingalpha.com/symbol/GSJK
[<link href="https://seekingalpha.com/symbol/CCLP" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/P
[<link href="https://seekingalpha.com/symbol/P" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/FIO
[<div class="defunct_message defunct_message_etf">FIO is defunct.</div>]
scrap - https://seekingalpha.com/symbol/TAOM
[<div class="defunct_message defunct_message_etf">TAOM is defunct.</di

[<link href="https://seekingalpha.com/symbol/CAPL" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/LNCO
[<link href="https://seekingalpha.com/symbol/LNCOQ" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/KYTH
[<div class="defunct_message defunct_message_etf">KYTH is defunct.</div>]
scrap - https://seekingalpha.com/symbol/FLTX
[<div class="defunct_message defunct_message_etf">FLTX is defunct.</div>]
scrap - https://seekingalpha.com/symbol/LOCK
[<div class="defunct_message defunct_message_etf">LOCK is defunct.</div>]
scrap - https://seekingalpha.com/symbol/JMI
[<div class="defunct_message defunct_message_etf">JMI is defunct.</div>]
scrap - https://seekingalpha.com/symbol/ALDW
[<div class="defunct_message defunct_message_etf">ALDW is defunct.</div>]
scrap - https://seekingalpha.com/symbol/RKUS
[<div class="defunct_message defunct_message_etf">RKUS is defunct.</div>]
scrap - https://seekingalpha.com/symbol/SXE
[<div class="defunct_message defunct_message_etf">SXE is d

[<div class="defunct_message defunct_message_etf">MEP is defunct.</div>]
scrap - https://seekingalpha.com/symbol/MVNR
[<div class="defunct_message defunct_message_etf">MVNR is defunct.</div>]
scrap - https://seekingalpha.com/symbol/NCFT
[<div class="defunct_message defunct_message_etf">NCFT is defunct.</div>]
scrap - https://seekingalpha.com/symbol/CUDA
[<div class="defunct_message defunct_message_etf">CUDA is defunct.</div>]
scrap - https://seekingalpha.com/symbol/QUNR
[<div class="defunct_message defunct_message_etf">QUNR is defunct.</div>]
scrap - https://seekingalpha.com/symbol/CQH
[<link href="https://seekingalpha.com/symbol/CQH" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/NMBL
[<div class="defunct_message defunct_message_etf">NMBL is defunct.</div>]
scrap - https://seekingalpha.com/symbol/FGL
[<link href="https://seekingalpha.com/symbol/FG" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/VLP
[<link href="https://seekingalpha.com/symbol/VLP" rel="canonica

[<link href="https://seekingalpha.com/symbol/CIVI" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/ABCW
scrap - https://seekingalpha.com/symbol/DM
[<link href="https://seekingalpha.com/symbol/DM" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/GBSN
scrap - https://seekingalpha.com/symbol/MOLG
scrap - https://seekingalpha.com/symbol/YDLE
[<div class="defunct_message defunct_message_etf">YDLE is defunct.</div>]
scrap - https://seekingalpha.com/symbol/JPEP
[<div class="defunct_message defunct_message_etf">JPEP is defunct.</div>]
scrap - https://seekingalpha.com/symbol/VWR
[<div class="defunct_message defunct_message_etf">VWR is defunct.</div>]
scrap - https://seekingalpha.com/symbol/NEFF
[<div class="defunct_message defunct_message_etf">NEFF is defunct.</div>]
scrap - https://seekingalpha.com/symbol/NEOT
[<link href="https://seekingalpha.com/symbol/EVFM" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/CNV
[<link href="https://seekingalpha.com/symbol/CNVAF"

scrap - https://seekingalpha.com/symbol/WRD
[<link href="https://seekingalpha.com/symbol/WRD" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/MULE
scrap - https://seekingalpha.com/symbol/NDRAU
scrap - https://seekingalpha.com/symbol/TNTR
scrap - https://seekingalpha.com/symbol/BSTI
scrap - https://seekingalpha.com/symbol/IPOAU
scrap - https://seekingalpha.com/symbol/ABLX
scrap - https://seekingalpha.com/symbol/SEND
[<link href="https://seekingalpha.com/symbol/SEND" rel="canonical"/>]
scrap - https://seekingalpha.com/symbol/GIG'U
scrap - https://seekingalpha.com/symbol/ARMO
[<div class="defunct_message defunct_message_etf">ARMO is defunct since January  1, 2019. Merged</div>]
scrap - https://seekingalpha.com/symbol/XSPL
scrap - https://seekingalpha.com/symbol/BNGOU


In [563]:
print(f"from {len(df.Symbol)} IPO companies listed on Nasdaq, {len(defunct)} companies are defunct as of May 2019, {len(acquired)} were probably acquired and {len(newticker)} changed their ticker")

from 1569 IPO companies listed on Nasdaq, 215 companies are defunct as of May 2019, 124 were probably acquired and 94 changed their ticker
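
The classification logic above can be sketched on HTML snippets like the ones printed in the log. To keep the example dependency-free I use the standard library's `html.parser` here instead of BeautifulSoup, which is a swap on my part. The class `SymbolPageParser`, the function `classify` and the status labels are hypothetical. The sketch also separates a canonical link that points back to the same ticker (like P) from one that points to a different ticker (like GSJK to CCLP). The loop above puts both into `newticker`.

```python
from html.parser import HTMLParser


class SymbolPageParser(HTMLParser):
    """Collects the two signals used above: a defunct banner and the canonical link."""

    def __init__(self):
        super().__init__()
        self.is_defunct = False
        self.canonical_ticker = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and "defunct_message" in (attrs.get("class") or "").split():
            self.is_defunct = True
        elif tag == "link" and attrs.get("rel") == "canonical" and attrs.get("href"):
            # The ticker is the last path segment of the canonical URL.
            self.canonical_ticker = attrs["href"].rstrip("/").rsplit("/", 1)[-1]


def classify(ticker, html):
    """Return (status, ticker_to_use) for one symbol page."""
    parser = SymbolPageParser()
    parser.feed(html)
    if parser.is_defunct:
        return "defunct", None
    if parser.canonical_ticker is None:
        return "unknown", None
    if parser.canonical_ticker.upper() == ticker.upper():
        return "same_ticker", ticker
    return "new_ticker", parser.canonical_ticker


print(classify("CTC", '<div class="defunct_message defunct_message_etf">CTC is defunct.</div>'))
print(classify("GSJK", '<link href="https://seekingalpha.com/symbol/CCLP" rel="canonical"/>'))
print(classify("P", '<link href="https://seekingalpha.com/symbol/P" rel="canonical"/>'))
```

Pages with neither signal come back as "unknown" here. The notebook labels that group "probably acquired", which is a guess: an empty or blocked response would land in the same bucket.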


In [566]:
acquired

['DVOX',
 'AH',
 'TNGN',
 'VRNGU',
 'BSFT',
 'CHKM',
 'MCP',
 'CIS',
 'OINK',
 'QLIK',
 'RLD',
 'SMT',
 'OXF',
 'AMAP',
 'KH',
 'IL',
 'PATH',
 'DMED',
 'TSRX',
 'ELT',
 'RNO',
 'CCSC',
 'GAGA',
 'EXAM',
 'MCOX',
 'AEGR',
 'BBRG',
 'XRS',
 'SHP',
 'NTSP',
 'BODY',
 'CCG',
 'GEDU',
 'ANAC',
 'SYSW',
 'XNY',
 'ARX',
 'LZEN',
 'KFFG',
 'TFM',
 'CTE',
 'SODA',
 'XUE',
 'SPBC',
 'QRE',
 'SWFT',
 'ISS',
 'LAS',
 'MOBI',
 'BONA',
 'CTP',
 'YOKU',
 'FXCM',
 'BCDS',
 'VELT',
 'DMD',
 'ZA',
 'KIPS',
 'IFT',
 'ECYT',
 'TBOW',
 'EPOC',
 'TZYM',
 'SZYM',
 'LPR',
 'FFN',
 'DATE',
 'NQ',
 'DDMG',
 'INVN',
 'LRE',
 'NRGM',
 'KORS',
 'GZT',
 'SN',
 'WGP',
 'KBIO',
 'LPDX',
 'CVRR',
 'NSLP',
 'NVEEU',
 'NBCB',
 'BIOAU',
 'CYNI',
 'ARPI',
 'RGDO',
 'EVHC',
 'CCCR',
 'QEPM',
 'STCK',
 'CVT',
 'WPT',
 'YUME',
 'HELI',
 'LMNS',
 'AMDA',
 'TLMR',
 'CFRXU',
 'ABCW',
 'GBSN',
 'MOLG',
 'MDVXU',
 'RMP',
 'MPG',
 'CNXR',
 'TEGP',
 'CLCD',
 'XTLY',
 'YECO',
 'LMFAU',
 'CERCU',
 'PAVMU',
 'ONSIU',
 'SRTSU',
 'TIG'

In [568]:
sorted(defunct)
df[df['Symbol'].isin(defunct)]

Unnamed: 0,Company Name,Symbol,Market,Price,Shares,Offer Amount,Date Priced,employees,address,US_state,descriptions,link_nasdaq
0,IFM INVESTMENTS LTD,CTC,New York Stock Exchange,$7,12487500,"$87,412,500",1/28/2010,4654.0,"9/A5, EAST WING, HANWEI PLAZANO.7 GUANGHUA ROA...",,We are a leading comprehensive real estate ser...,https://www.nasdaq.com/markets/ipos/company/if...
2,CHINA HYDROELECTRIC CORP,CHC,New York Stock Exchange,$16,6000000,"$96,000,000",1/25/2010,336.0,"420 LEXINGTON AVENUESUITE 860NEW YORK, NY 10170",NY,"We are a fast-growing consolidator, operator a...",https://www.nasdaq.com/markets/ipos/company/ch...
4,"CELLU TISSUE HOLDINGS, INC.",CLU,New York Stock Exchange,$13,8300000,"$107,900,000",1/22/2010,1160.0,"3442 FRANCIS ROADSUITE 220ALPHARETTA, GA 30004",GA,We are a North American producer of tissue pro...,https://www.nasdaq.com/markets/ipos/company/ce...
5,SYMETRA FINANCIAL CORP,SYA,New York Stock Exchange,$12,30400000,"$364,800,000",1/22/2010,1100.0,"777 108TH AVENUE NESUITE 1200BELLEVUE, WA 9800...",WA,We are a life insurance company focused on pro...,https://www.nasdaq.com/markets/ipos/company/sy...
7,GRAHAM PACKAGING CO INC.,GRM,New York Stock Exchange,$10,16666667,"$166,666,670",2/11/2010,7400.0,"2401 PLEASANT VALLEY ROADYORK, PA 17402",PA,"We are a worldwide leader in the design, manuf...",https://www.nasdaq.com/markets/ipos/company/gr...
12,MERU NETWORKS INC,MERU,NASDAQ,$15,4386784,"$65,801,760",3/31/2010,242.0,"894 ROSS DRIVESUNNYVALE, CA 94089",CA,We provide a virtualized wireless LAN solution...,https://www.nasdaq.com/markets/ipos/company/me...
20,"FINANCIAL ENGINES, LLC",FNGN,NASDAQ,$12,10600000,"$127,200,000",3/16/2010,264.0,"1050 ENTERPRISE WAY, 3RD FLSUNNYVALE, CA 94089",CA,Our company was founded to address the need fo...,https://www.nasdaq.com/markets/ipos/company/fi...
21,CRUDE CARRIERS CORP.,CRU,New York Stock Exchange,$19,13500000,"$256,500,000",3/12/2010,,IASSONOS 3PIRAEUS 18537,,We are a newly formed transportation company i...,https://www.nasdaq.com/markets/ipos/company/cr...
24,BALTIC TRADING LTD,BALT,New York Stock Exchange,$14,16300000,"$228,200,000",3/10/2010,,"299 PARK AVENUE, 12TH FLOORNEW YORK, NY 10171",NY,We are a newly formed New York City-based comp...,https://www.nasdaq.com/markets/ipos/company/ba...
26,PAA NATURAL GAS STORAGE LP,PNG,New York Stock Exchange,$21.50,11720000,"$251,980,000",4/30/2010,,"333 CLAY STREET, SUITE 1500HOUSTON, TX 77002",TX,"We are a fee-based, growth-oriented Delaware l...",https://www.nasdaq.com/markets/ipos/company/pa...


In [602]:
newticker[:10]

['GEGSQ',
 'MITL',
 'LLIT',
 'NORNQ',
 'SANW',
 'VLTC',
 'ANDX',
 'STIR',
 'GNMX',
 'MSFT']

In [569]:
df_yahoo.head()

Unnamed: 0,firstday_adjclose,firstday_open,firstday_spread,firstday_volume,inmonth_adjclose,inmonth_open,inmonth_spread,inmonth_volume,inweek_adjclose,inweek_open,inweek_spread,inweek_volume
AMCF,5.77,6.3,0.79,1113500.0,7.59,8.0,0.79,323300.0,5.83,6.03,0.28,106300.0
CHSP,11.996906,19.0,0.150002,153200.0,12.50808,19.4,0.51,40500.0,11.927483,18.91,0.139999,75900.0
GNRC,8.460629,13.0,0.85,9627100.0,9.71918,14.91,0.929999,265400.0,8.612183,13.05,0.29,110000.0
QNST,15.0,15.0,0.7,5815700.0,16.41,16.01,0.749999,27300.0,12.98,13.5,0.809999,504000.0
TRNO,14.692022,18.75,0.42,4515300.0,15.330121,19.530001,0.190001,64300.0,14.770803,18.809999,0.199999,91200.0


In [570]:
df_yahoo.shape

(1136, 12)

## Joining two dataframes - from NASDAQ and from Yahoo Finance:

In [644]:
df.index = df.Symbol

In [645]:
df_ipo = df.join(df_yahoo, how='inner')
df_ipo.head()

Unnamed: 0,Company Name,Symbol,Market,Price,Shares,Offer Amount,Date Priced,employees,address,US_state,...,firstday_spread,firstday_volume,inmonth_adjclose,inmonth_open,inmonth_spread,inmonth_volume,inweek_adjclose,inweek_open,inweek_spread,inweek_volume
AMCF,ANDATEE CHINA MARINE FUEL SERVICES CORP,AMCF,NASDAQ,6.3,3134921.0,19750002.0,2010-01-26,128.0,NO. 68 BINHAI RD DALIAN XIGANG DISTRICTDALIAN ...,,...,0.79,1113500.0,7.59,8.0,0.79,323300.0,5.83,6.03,0.28,106300.0
CHSP,CHESAPEAKE LODGING TRUST,CHSP,New York Stock Exchange,20.0,7500000.0,150000000.0,2010-01-22,3.0,"4300 WILSON BOULEVARDSUITE 625ARLINGTON, VA 22203",VA,...,0.150002,153200.0,12.50808,19.4,0.51,40500.0,11.927483,18.91,0.139999,75900.0
GNRC,GENERAC HOLDINGS INC.,GNRC,New York Stock Exchange,13.0,18750000.0,243750000.0,2010-02-11,1486.0,"S45 W29290 HIGHWAY 59WAUKESHA, WI 53187",WI,...,0.85,9627100.0,9.71918,14.91,0.929999,265400.0,8.612183,13.05,0.29,110000.0
QNST,"QUINSTREET, INC",QNST,NASDAQ,15.0,10000000.0,150000000.0,2010-02-11,568.0,"950 TOWER LANE, 6TH FLOORFOSTER CITY, CA 94404",CA,...,0.7,5815700.0,16.41,16.01,0.749999,27300.0,12.98,13.5,0.809999,504000.0
TRNO,TERRENO REALTY CORP,TRNO,New York Stock Exchange,20.0,8750000.0,175000000.0,2010-02-10,6.0,"16 MAIDEN LANEFIFTH FLOORSAN FRANCISCO, CA 94108",CA,...,0.42,4515300.0,15.330121,19.530001,0.190001,64300.0,14.770803,18.809999,0.199999,91200.0


In [646]:
df_ipo.shape

(1136, 25)
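
The inner join above keeps only tickers present in both frames. A toy sketch of the same idea, with a three-row stand-in for the NASDAQ frame and a two-row stand-in for the Yahoo Finance frame (variable names are hypothetical), also shows how to list which tickers the join drops:

```python
import pandas as pd

# Stand-in for the NASDAQ frame, indexed by ticker (Symbol stays a column too).
nasdaq = pd.DataFrame({
    "Symbol": ["GNRC", "QNST", "CTC"],
    "Price": [13.0, 15.0, 7.0],
})
nasdaq.index = nasdaq["Symbol"].tolist()

# Stand-in for the Yahoo Finance frame: CTC has no price history.
yahoo = pd.DataFrame(
    {"firstday_open": [13.0, 15.0]},
    index=["GNRC", "QNST"],
)

joined = nasdaq.join(yahoo, how="inner")         # tickers found in both frames
dropped = nasdaq.index.difference(yahoo.index)   # tickers lost by the inner join

print(joined.shape)
print(list(dropped))
```

Checking the dropped tickers against `notscraped` is a cheap sanity test that the join loses exactly the companies expected and nothing else.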

In [621]:
df_ipo.to_csv('ipo_stock_2010_2018.csv', index=False)

## 1.2. Scraping profile and industry from Yahoo Finance

In this step, I scrape each company's sector, industry, and current number of employees from Yahoo Finance and, if available, the CEO's pay and year of birth.

In [575]:
sector = []
industry = []
employees2019 = []
CEO_pay = []
CEO_born = []

url_1 = 'https://finance.yahoo.com/quote/%s/'
url_2 = 'profile?p=%s'

for i, ticker in enumerate(df_ipo.Symbol):
    print(f"scraping {ticker} - {i+1}/{len(df_ipo.Symbol)}")
    result = requests.get(url_1 % ticker + url_2 % ticker)
    content = result.content

    soup = BeautifulSoup(content, "html.parser")
    data = soup.find_all('span', {"class":"Fw(600)"})
    if data:
        txt = [x.get_text() for x in data]
        sector.append(txt[0])
        industry.append(txt[1])
        employees2019.append(txt[2])

    else:
        sector.append(np.nan)
        industry.append(np.nan)
        employees2019.append(np.nan)
    
    try:
        table = pd.read_html(content)[0]
        #checking if table has CEO 
        if ('Title' in table) and (table['Title'].str.contains('CEO').sum() == 1):
            CEO_pay.append(table['Pay'][table['Title'].str.contains('CEO')].values[0]) 
            CEO_born.append(table['Year Born'][table['Title'].str.contains('CEO')].values[0])
        else:
            CEO_pay.append(np.nan) 
            CEO_born.append(np.nan)

    except ValueError:
        CEO_pay.append(np.nan) 
        CEO_born.append(np.nan)
    
print(f"checking lengths sector: {len(sector)}, industry: {len(industry)}, employees2019: {len(employees2019)}, CEO pay: {len(CEO_pay)}, CEO_born: {len(CEO_born)}")

scraping AMCF - 1/1136
scraping CHSP - 2/1136
scraping GNRC - 3/1136
scraping QNST - 4/1136
scraping TRNO - 5/1136
scraping PDM - 6/1136
scraping IRWD - 7/1136
scraping SSNC - 8/1136
scraping STNG - 9/1136
scraping HTHT - 10/1136
scraping CRMD - 11/1136
scraping FIBK - 12/1136
scraping CALX - 13/1136
scraping MXL - 14/1136
scraping AVEO - 15/1136
scraping ST - 16/1136
scraping ANTH - 17/1136
scraping AOSL - 18/1136
scraping CDXS - 19/1136
scraping ALIM - 20/1136
scraping SPSC - 21/1136
scraping CLDT - 22/1136
scraping MUSA - 23/1136
scraping PRI - 24/1136
scraping GNMK - 25/1136
scraping JKS - 26/1136
scraping EXPR - 27/1136
scraping RRTS - 28/1136
scraping TNAV - 29/1136
scraping PLOW - 30/1136
scraping TSLA - 31/1136
scraping FN - 32/1136
scraping OAS - 33/1136
scraping BORN - 34/1136
scraping GDOT - 35/1136
scraping AMRC - 36/1136
scraping WSR - 37/1136
scraping ELMD - 38/1136
scraping MMYT - 39/1136
scraping RP - 40/1136
scraping NXPI - 41/1136
scraping AMRS - 42/1136
scraping COR 

scraping TWTR - 338/1136
scraping WIX - 339/1136
scraping KPTI - 340/1136
scraping AVH - 341/1136
scraping BCRH - 342/1136
scraping TCS - 343/1136
scraping AMC - 344/1136
scraping HLT - 345/1136
scraping KIN - 346/1136
scraping SALT - 347/1136
scraping TLOG - 348/1136
scraping CTT - 349/1136
scraping ARMK - 350/1136
scraping ATHM - 351/1136
scraping XNCR - 352/1136
scraping CARA - 353/1136
scraping NWHM - 354/1136
scraping RARE - 355/1136
scraping MBUU - 356/1136
scraping TRVN - 357/1136
scraping DRNA - 358/1136
scraping CRCM - 359/1136
scraping AKER - 360/1136
scraping SC - 361/1136
scraping EPE - 362/1136
scraping CELP - 363/1136
scraping GLYC - 364/1136
scraping RTRX - 365/1136
scraping VRNS - 366/1136
scraping SMLR - 367/1136
scraping INGN - 368/1136
scraping IBP - 369/1136
scraping CNCE - 370/1136
scraping EGRX - 371/1136
scraping FLXN - 372/1136
scraping GPRK - 373/1136
scraping LADR - 374/1136
scraping RVNC - 375/1136
scraping QURE - 376/1136
scraping GNCA - 377/1136
scraping CB

scraping EQBK - 669/1136
scraping KURA - 670/1136
scraping FORK - 671/1136
scraping CCRC - 672/1136
scraping YRD - 673/1136
scraping TEAM - 674/1136
scraping PTI - 675/1136
scraping BGNE - 676/1136
scraping EDIT - 677/1136
scraping CRVS - 678/1136
scraping SENS - 679/1136
scraping HCM - 680/1136
scraping SNDX - 681/1136
scraping GWRS - 682/1136
scraping RRR - 683/1136
scraping YIN - 684/1136
scraping SCWX - 685/1136
scraping ARA - 686/1136
scraping MGP - 687/1136
scraping AGLE - 688/1136
scraping GMS - 689/1136
scraping RETA - 690/1136
scraping USFD - 691/1136
scraping MSBI - 692/1136
scraping MRUS - 693/1136
scraping SUPV - 694/1136
scraping PLSE - 695/1136
scraping ACIA - 696/1136
scraping SITE - 697/1136
scraping TPB - 698/1136
scraping SBPH - 699/1136
scraping NTLA - 700/1136
scraping SYRS - 701/1136
scraping GMRE - 702/1136
scraping TWLO - 703/1136
scraping SELB - 704/1136
scraping VIVE - 705/1136
scraping COE - 706/1136
scraping ATKR - 707/1136
scraping CLSD - 708/1136
scraping M

scraping HUYA - 1001/1136
scraping EQH - 1002/1136
scraping EVLO - 1003/1136
scraping OBNK - 1004/1136
scraping ROAD - 1005/1136
scraping STXB - 1006/1136
scraping ASLN - 1007/1136
scraping BCML - 1008/1136
scraping CBLK - 1009/1136
scraping INSP - 1010/1136
scraping UBX - 1011/1136
scraping PRT - 1012/1136
scraping DOMO - 1013/1136
scraping BJ - 1014/1136
scraping BV - 1015/1136
scraping ENTX - 1016/1136
scraping EVER - 1017/1136
scraping FTSV - 1018/1136
scraping STIM - 1019/1136
scraping TBIO - 1020/1136
scraping TCDA - 1021/1136
scraping UXIN - 1022/1136
scraping NTGN - 1023/1136
scraping LOVE - 1024/1136
scraping HYRE - 1025/1136
scraping ECOR - 1026/1136
scraping AUTL - 1027/1136
scraping APTX - 1028/1136
scraping IIIV - 1029/1136
scraping KZR - 1030/1136
scraping MGTA - 1031/1136
scraping AVRO - 1032/1136
scraping XERS - 1033/1136
scraping EIDX - 1034/1136
scraping VRCA - 1035/1136
scraping NEW - 1036/1136
scraping AVLR - 1037/1136
scraping CHRA - 1038/1136
scraping USX - 1039/1
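
The CEO lookup in the loop above rebuilds the same `str.contains` mask several times. Below is a minimal sketch of the same check as a helper, assuming the officers table has `Title`, `Pay` and `Year Born` columns as in the loop. `ceo_fields` and the `officers` frame are hypothetical names with made-up data, not part of the original notebook.

```python
import numpy as np
import pandas as pd


def ceo_fields(table):
    """Return (pay, year_born) for the single CEO row, or (nan, nan)."""
    if 'Title' not in table.columns:
        return np.nan, np.nan
    # na=False treats missing titles as non-matches
    is_ceo = table['Title'].str.contains('CEO', na=False)
    if is_ceo.sum() != 1:
        # zero or several CEO rows: ambiguous, so report missing values
        return np.nan, np.nan
    row = table.loc[is_ceo].iloc[0]
    return row.get('Pay', np.nan), row.get('Year Born', np.nan)


# made-up officers table for illustration only
officers = pd.DataFrame({
    'Name': ['A. Example', 'B. Example'],
    'Title': ['Chairman & CEO', 'CFO'],
    'Pay': ['2.32M', '816.15k'],
    'Year Born': [1962, 1960],
})
pay, born = ceo_fields(officers)
```

Inside the loop this would reduce the `try` body to one call whose two results are appended to `CEO_pay` and `CEO_born`.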

In [622]:
employees2019_clean = []
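for emp in employees2019:
    # scraped counts are strings such as '5,046'; missing values are np.nan.
    # isinstance is safer than `emp is not np.nan`, which only matches the np.nan object itself
    if isinstance(emp, str) and len(emp) > 0:
        emp = int(emp.replace(',', ''))
    else:
        emp = np.nan
    employees2019_clean.append(emp)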
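
The cell above calls `int()` on any non-empty string, so a non-numeric value would stop the loop with a `ValueError`. Below is a minimal sketch of a more defensive version. `parse_employee_count` is a hypothetical helper and the sample values are made up.

```python
import math


def parse_employee_count(value):
    """Convert a scraped count such as '5,046' to int; return NaN otherwise."""
    if not isinstance(value, str):
        return math.nan
    digits = value.replace(',', '').strip()
    # isdecimal() is False for '' and for text such as 'N/A'
    return int(digits) if digits.isdecimal() else math.nan


raw = ['5,046', '13', '', math.nan, 'N/A']  # made-up sample values
cleaned = [parse_employee_count(v) for v in raw]
```

With this helper the cell becomes a single list comprehension over `employees2019`.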

In [623]:
CEO_pay_clean = []
for i, cp in enumerate(CEO_pay):
    if cp == cp:  # NaN is the only value that is not equal to itself
        print(i, cp)
        if isinstance(cp, float):
            # a bare float is taken to be in thousands
            cp_clean = cp * 1000
        elif 'M' in cp:
            cp_clean = float(cp.replace("M", "")) * 1000000
        elif 'k' in cp:
            cp_clean = float(cp.replace("k", "")) * 1000
        else:
            # unrecognised format: store NaN instead of reusing the previous row's value
            cp_clean = np.nan
        CEO_pay_clean.append(cp_clean)
    else:
        print(i, 'this one', cp)
        CEO_pay_clean.append(cp)

0 this one nan
1 2.32M
2 1.65M
3 816.15k
4 799.5k
5 1.91M
6 this one nan
7 10.88M
8 this one nan
9 this one nan
10 586.84k
11 1.71M
12 500k
13 554.71k
14 770.17k
15 1.57M
16 715.3k
17 1.34M
18 1.15M
19 501.12k
20 1.39M
21 2.39M
22 3.17M
23 2.39M
24 1M
25 this one nan
26 1.2M
27 839.92k
28 821.3k
29 854.23k
30 this one nan
31 700.01k
32 1.84M
33 this one nan
34 1.88M
35 1.17M
36 619.25k
37 418.35k
38 this one nan
39 1.38M
40 4.45M
41 1.8M
42 1.52M
43 this one nan
44 1.03M
45 1.4M
46 2.11M
47 this one nan
48 this one nan
49 this one nan
50 this one nan
51 978.89k
52 this one nan
53 869.7k
54 7.36M
55 3.26M
56 this one nan
57 this one nan
58 419.16k
59 this one nan
60 6.01M
61 this one nan
62 461.26k
63 this one nan
64 13.69k
65 2.25M
66 1.47M
67 2.53M
68 2.22M
69 303.75k
70 856.64k
71 this one nan
72 3.1M
73 this one nan
74 1.57M
75 this one nan
76 1.82M
77 941k
78 908.96k
79 1.05M
80 1.68M
81 790.22k
82 429.17k
83 1.26M
84 638.34k
85 252.62k
86 375.72k
87 1.67M
88 1.64M
89 3.84M
90 this

1068 629.06k
1069 1.41M
1070 402.5k
1071 2.5M
1072 this one nan
1073 730.85k
1074 this one nan
1075 467.31k
1076 783.25k
1077 705.87k
1078 601.61k
1079 this one nan
1080 641.19k
1081 this one nan
1082 860.18k
1083 this one nan
1084 782.74k
1085 this one nan
1086 475.07k
1087 358.11k
1088 1.73M
1089 this one nan
1090 790.33k
1091 this one nan
1092 1.27M
1093 this one nan
1094 this one nan
1095 440.8k
1096 this one nan
1097 524.67k
1098 580k
1099 this one nan
1100 this one nan
1101 1.73M
1102 590.87k
1103 this one nan
1104 1.37M
1105 this one nan
1106 1.11M
1107 561.1k
1108 927.16k
1109 1.11M
1110 315k
1111 1.09M
1112 1.77M
1113 706.49k
1114 818.4k
1115 727.54k
1116 808.44k
1117 this one nan
1118 this one nan
1119 this one nan
1120 this one nan
1121 this one nan
1122 521.04k
1123 493.02k
1124 444.05k
1125 this one nan
1126 this one nan
1127 2.63M
1128 this one nan
1129 this one nan
1130 50k
1131 this one nan
1132 606.48k
1133 492.4k
1134 2.67M
1135 this one nan
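
The pay strings printed above all end in `k` or `M`. Below is a minimal sketch of the same conversion as a lookup on the final character. `parse_pay` and `MULTIPLIERS` are hypothetical names, and the `B` entry is an assumption, since no such value appears in the output. Unlike the cell above, this sketch does not scale bare floats by 1000; it returns NaN for anything that is not a suffixed string.

```python
import math

# suffix multipliers; 'k' and 'M' appear in the scraped output, 'B' is an assumption
MULTIPLIERS = {'k': 1e3, 'M': 1e6, 'B': 1e9}


def parse_pay(value):
    """Convert '2.32M' or '816.15k' to a plain amount; NaN if unparseable."""
    if not isinstance(value, str) or not value:
        return math.nan
    number, suffix = value[:-1], value[-1]
    if suffix not in MULTIPLIERS:
        return math.nan
    try:
        return float(number) * MULTIPLIERS[suffix]
    except ValueError:
        # the numeric part was not a valid number
        return math.nan


sample = ['2.32M', '816.15k', '500k', math.nan, 'N/A']
parsed = [parse_pay(v) for v in sample]
```

Because every input maps to exactly one output, the result always has the same length as `CEO_pay`, and no value can leak from one row to the next.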


# Final DataFrame

In [647]:
df_ipo['sector'] = sector
df_ipo['industry'] = industry
df_ipo['employees2019'] = employees2019_clean
df_ipo['CEO_pay'] = CEO_pay_clean
df_ipo['CEO_born'] = CEO_born

In [625]:
df_ipo.head()

Unnamed: 0,Company Name,Symbol,Market,Price,Shares,Offer Amount,Date Priced,employees,address,US_state,...,inmonth_volume,inweek_adjclose,inweek_open,inweek_spread,inweek_volume,sector,industry,employees2019,CEO_pay,CEO_born
AMCF,ANDATEE CHINA MARINE FUEL SERVICES CORP,AMCF,NASDAQ,6.3,3134921.0,19750002.0,2010-01-26,128.0,NO. 68 BINHAI RD DALIAN XIGANG DISTRICTDALIAN ...,,...,323300.0,5.83,6.03,0.28,106300.0,Energy,Oil & Gas Equipment & Services,189.0,,
CHSP,CHESAPEAKE LODGING TRUST,CHSP,New York Stock Exchange,20.0,7500000.0,150000000.0,2010-01-22,3.0,"4300 WILSON BOULEVARDSUITE 625ARLINGTON, VA 22203",VA,...,40500.0,11.927483,18.91,0.139999,75900.0,Real Estate,REIT - Hotel & Motel,13.0,2320000.0,1962.0
GNRC,GENERAC HOLDINGS INC.,GNRC,New York Stock Exchange,13.0,18750000.0,243750000.0,2010-02-11,1486.0,"S45 W29290 HIGHWAY 59WAUKESHA, WI 53187",WI,...,265400.0,8.612183,13.05,0.29,110000.0,Industrials,Diversified Industrials,5046.0,1650000.0,1972.0
QNST,"QUINSTREET, INC",QNST,NASDAQ,15.0,10000000.0,150000000.0,2010-02-11,568.0,"950 TOWER LANE, 6TH FLOORFOSTER CITY, CA 94404",CA,...,27300.0,12.98,13.5,0.809999,504000.0,Technology,Internet Content & Information,506.0,816150.0,1960.0
TRNO,TERRENO REALTY CORP,TRNO,New York Stock Exchange,20.0,8750000.0,175000000.0,2010-02-10,6.0,"16 MAIDEN LANEFIFTH FLOORSAN FRANCISCO, CA 94108",CA,...,64300.0,14.770803,18.809999,0.199999,91200.0,Real Estate,REIT - Industrial,23.0,799500.0,1961.0


In [648]:
df_ipo.shape

(1136, 30)

In [670]:
df_ipo.to_csv('ipo_stock_2010_2018.csv', index=False)