## Step 1: Access the Web Page
Accessing a web page is as easy as typing a URL into a browser.
This time, though, we need to take the human out of the process, so we use the requests library to fetch the page programmatically.

In [1]:
import requests

In [2]:
url = "http://eoddata.com/stocklist/NASDAQ/A.htm"
page = requests.get(url)
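
Before parsing, it can help to confirm that the request actually succeeded. The sketch below wraps the call in a small helper; the name `fetch_page` and the 10-second default timeout are assumptions for illustration, not part of the original notebook.

```python
import requests


def fetch_page(url, timeout=10.0):
    """Fetch a URL and return the response (hypothetical helper).

    The timeout keeps the script from hanging on an unresponsive server,
    and raise_for_status() raises requests.HTTPError on a 4xx/5xx reply.
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response
```

With this helper, `page = fetch_page(url)` could stand in for the plain `requests.get(url)` call, so a failed download stops the script early instead of passing an error page on to the parser.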

## Step 2: Locate and parse the items to be scraped
We can use BeautifulSoup to locate and parse HTML elements. It’s often used together with the requests library.

In [4]:
from bs4 import BeautifulSoup

In [5]:
soup = BeautifulSoup(page.text, 'html.parser')

In [8]:
print(soup.prettify())

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <link href="../../styles/jquery-ui-1.10.0.custom.min.css" rel="stylesheet" type="text/css"/>
  <link href="../../styles/main.css" rel="stylesheet" type="text/css"/>
  <link href="../../styles/button.css" rel="stylesheet" type="text/css"/>
  <link href="../../styles/nav.css" rel="stylesheet" type="text/css"/>
  <script src="/scripts/jquery-1.9.0.min.js" type="text/javascript">
  </script>
  <script src="/scripts/jquery-ui-1.10.0.custom.min.js" type="text/javascript">
  </script>
  <script type="text/javascript">
   var _sf_startpt = (new Date()).getTime()
  </script>
  <script src="scripts/jquery-1.4.2.min.js" type="text/javascript">
  </script>
  <meta content="list of symbols for NASDAQ Stock Exchange,list of stock symbols,download symbols,stock symbols list,NASDAQ symbol list,NASDAQ stock ticker,NASDAQ stock li

In [9]:
# Collect the text of every table cell inside the symbols container.
elements = []
table = soup.find('div', {'id': 'ctl00_cph1_divSymbols'})
for tr in table.find_all('tr'):
    for td in tr.find_all('td'):
        elements.append(td.text)

# The code assumes 10 cells per row: the symbol is the first cell
# of each row and the company name is the second.
symbol = elements[0::10]
names = elements[1::10]
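
The extraction above depends on every row having exactly 10 cells. As an alternative, here is a minimal sketch that reads the first two cells of each row directly, so it does not depend on the column count. The `SAMPLE_HTML` snippet, its header labels, and the `extract_symbols` helper are hypothetical stand-ins; only the container id is taken from the page.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down stand-in for the real page's symbol table.
SAMPLE_HTML = """
<div id="ctl00_cph1_divSymbols">
  <table>
    <tr><th>Code</th><th>Name</th><th>High</th></tr>
    <tr><td>AACG</td><td>Ata Creativity Global</td><td>1.20</td></tr>
    <tr><td>AAL</td><td>American Airlines Gp</td><td>12.50</td></tr>
  </table>
</div>
"""


def extract_symbols(html, container_id="ctl00_cph1_divSymbols"):
    """Return (symbol, name) pairs from the first two cells of each row."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"id": container_id})
    if container is None:
        # find() returns None when nothing matches, e.g. if the layout changed.
        raise ValueError(f"no <div> with id {container_id!r} found")

    pairs = []
    for tr in container.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Header rows contain only <th> cells, so they yield no <td> and are skipped.
        if len(cells) >= 2:
            pairs.append((cells[0], cells[1]))
    return pairs


pairs = extract_symbols(SAMPLE_HTML)
symbol = [code for code, _ in pairs]
names = [name for _, name in pairs]
```

Working row by row keeps each symbol paired with its own name even if the site adds or removes a column.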

## Step 3: Save scraped items on a file
We stored the stock symbols and names in two lists, symbol and names, respectively. From here we can use the pandas library to put these lists in a DataFrame and write them out as a JSON file.

In [10]:
import pandas as pd

In [11]:
df = pd.DataFrame(index = None)
df['stock_symbol'] = symbol
df['stock_name'] = names
df.head()

| | stock_symbol | stock_name |
|---|---|---|
| 0 | AACG | Ata Creativity Global |
| 1 | AACQ | Artius Acquisition Inc Cl A |
| 2 | AACQU | Artius Acquisition Inc Unit |
| 3 | AACQW | Artius Acquisition Inc WT |
| 4 | AAL | American Airlines Gp |


In [12]:
df.set_index('stock_symbol', inplace = True)

In [13]:
df.head()

| stock_symbol | stock_name |
|---|---|
| AACG | Ata Creativity Global |
| AACQ | Artius Acquisition Inc Cl A |
| AACQU | Artius Acquisition Inc Unit |
| AACQW | Artius Acquisition Inc WT |
| AAL | American Airlines Gp |
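
The same DataFrame can also be built in a single expression. This is a sketch on a three-row sample copied from the output above, not the full scraped lists.

```python
import pandas as pd

# Small sample copied from the output above; the real lists come from the scrape.
symbol = ["AACG", "AACQ", "AAL"]
names = [
    "Ata Creativity Global",
    "Artius Acquisition Inc Cl A",
    "American Airlines Gp",
]

# Build both columns at once, then use the symbol column as the index.
df = pd.DataFrame({"stock_symbol": symbol, "stock_name": names}).set_index("stock_symbol")

# Look up a company name by its ticker symbol.
aal_name = df.loc["AAL", "stock_name"]
```

Passing a dict of lists makes pandas check that both lists have the same length, which would surface a mismatch between symbols and names at construction time.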


In [14]:
df.to_json('NASDAQ Stock List')
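
To see what `to_json` writes, the sketch below saves a two-row sample and reads it back with the standard `json` module. The temporary directory and the file name `nasdaq_stock_list.json` are assumptions chosen for illustration.

```python
import json
import os
import tempfile

import pandas as pd

# Two-row sample with the same shape as the DataFrame above.
df = pd.DataFrame(
    {"stock_name": ["Ata Creativity Global", "American Airlines Gp"]},
    index=pd.Index(["AACG", "AAL"], name="stock_symbol"),
)

# Hypothetical output path inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "nasdaq_stock_list.json")

# By default a DataFrame is written as {column: {index: value}}.
df.to_json(out_path)

with open(out_path) as fh:
    by_column = json.load(fh)

# orient="index" groups by row instead: {index: {column: value}}.
by_symbol = json.loads(df.to_json(orient="index"))
```

Here `by_column["stock_name"]["AAL"]` and `by_symbol["AAL"]["stock_name"]` both give the company name, so the choice of `orient` mostly depends on how the file will be consumed.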