# **Web Scraping**


## Objectives


* Extract information from a given website
* Scrape the names of the programming languages and their average annual salaries
* Write the scraped data to a CSV file


## Extract information from the given website


In [2]:
# URL to scrape
url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/labs/datasets/Programming_Languages.html"

Import the required libraries


In [3]:
from bs4 import BeautifulSoup
import requests
import pandas as pd

Download the webpage at the URL


In [4]:
data = requests.get(url).text
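
The call above returns whatever the server sends back, including the body of an error page. A minimal sketch of a more defensive download follows; the helper name `fetch_html` and the 10-second timeout are assumptions for illustration, not part of the original lab.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text (hypothetical helper, not from the lab)."""
    # The timeout value is an assumption; it stops the call from hanging forever
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of parsing an error page
    response.raise_for_status()
    return response.text


# Usage (performs a network request):
# data = fetch_html(url)
```

Failing early here keeps a bad response from surfacing later as a confusing parsing error.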

Create a soup object


In [5]:
soup = BeautifulSoup(data, 'html.parser')

Scrape the `Language name` and `annual average salary`. In each table row, these are the second and fourth cells (indexes 1 and 3).


In [7]:
# Lists to store the scraped data
languages = []
salaries = []

# Loop to scrape the rows in the table (<tr> tag represents a table row in HTML)
for row in soup.find_all('tr'):
    
    # Each column cell is represented by <td>
    col = row.find_all('td') 

    # Store the text values of the languages and salaries column
    language_name = col[1].get_text()
    avg_salary = col[3].get_text()

    # Add the values to the lists
    languages.append(language_name)
    salaries.append(avg_salary)

print(languages)
print(salaries)

['Language', 'Python', 'Java', 'R', 'Javascript', 'Swift', 'C++', 'C#', 'PHP', 'SQL', 'Go']
['Average Annual Salary', '$114,383', '$101,013', '$92,037', '$110,981', '$130,801', '$113,865', '$88,726', '$84,727', '$84,793', '$94,082']
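
The loop above assumes every `<tr>` contains at least four `<td>` cells; a shorter row would raise an `IndexError`. The sketch below shows one way to guard against that. The `sample_html` markup is made up for illustration (only the column positions and the two salary values mirror the output above), and the `sample_` names are hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up markup with the language in cell 1 and the salary in cell 3
sample_html = """
<table>
  <tr><td>-</td><td>Language</td><td>-</td><td>Average Annual Salary</td></tr>
  <tr><td>-</td><td>Python</td><td>-</td><td>$114,383</td></tr>
  <tr><td colspan="4">A row with a single cell</td></tr>
  <tr><td>-</td><td>Java</td><td>-</td><td>$101,013</td></tr>
</table>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")
sample_languages = []
sample_salaries = []

for row in sample_soup.find_all("tr"):
    cells = row.find_all("td")

    # Skip rows that do not have both the language and salary cells
    if len(cells) < 4:
        continue

    # strip=True trims surrounding whitespace from the cell text
    sample_languages.append(cells[1].get_text(strip=True))
    sample_salaries.append(cells[3].get_text(strip=True))

print(sample_languages)
print(sample_salaries)
```

The single-cell row is skipped, so both lists stay the same length and can still be paired up column by column.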


Store the data in a pandas DataFrame

In [None]:
# The first scraped row holds the column headers; the remaining rows are the data
df = pd.DataFrame({languages[0]: languages[1:], salaries[0]: salaries[1:]})
df

|   | Language | Average Annual Salary |
|---|---|---|
| 0 | Python | $114,383 |
| 1 | Java | $101,013 |
| 2 | R | $92,037 |
| 3 | Javascript | $110,981 |
| 4 | Swift | $130,801 |
| 5 | C++ | $113,865 |
| 6 | C# | $88,726 |
| 7 | PHP | $84,727 |
| 8 | SQL | $84,793 |
| 9 | Go | $94,082 |
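
The salaries are scraped as text such as `$114,383`, so they cannot be sorted or averaged numerically as they stand. This step is not part of the original lab, but a minimal sketch of the conversion could look like the following; the `salary_df` name and the `Salary (USD)` column are assumptions, and the three rows are copied from the output above.

```python
import pandas as pd

# A small frame with the same columns as the scraped data
salary_df = pd.DataFrame({
    "Language": ["Python", "Java", "R"],
    "Average Annual Salary": ["$114,383", "$101,013", "$92,037"],
})

# Remove the dollar sign and thousands separators, then convert to integers
salary_df["Salary (USD)"] = (
    salary_df["Average Annual Salary"]
    .str.replace(r"[$,]", "", regex=True)
    .astype(int)
)

# Numeric values now sort by magnitude rather than alphabetically
print(salary_df.sort_values("Salary (USD)", ascending=False))
```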


Save the DataFrame to a CSV file

In [18]:
df.to_csv("popular-languages.csv", index=False)
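
Passing `index=False` leaves the row index out of the file, so reading the file back yields only the two data columns. One way to check this is a quick round trip, sketched here with a temporary directory; the `out_df` and `check_df` names are assumptions, and the rows are copied from the output above.

```python
import os
import tempfile

import pandas as pd

out_df = pd.DataFrame({
    "Language": ["Python", "Java"],
    "Average Annual Salary": ["$114,383", "$101,013"],
})

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "popular-languages.csv")
    out_df.to_csv(csv_path, index=False)

    # Read the file back while it still exists
    check_df = pd.read_csv(csv_path)

print(check_df)
```

The salary values contain commas, so pandas quotes them when writing and `read_csv` restores them as single fields.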

## Note

This is a modified version of the original IBM Lab. Many things were removed to make it easier to understand and follow along.