## 1. Importing Python Libraries

We shall start by importing the essential Python libraries.

In [1]:
### IMPORTING LIBRARIES
import numpy as np
import pandas as pd

## 2. Pulling Data from the URL 

Next, let us take the URL of the Wikipedia page that lists all the songs recorded by The Beatles and read it with pandas. `pd.read_html` returns a list of dataframes, one for each HTML table it finds on the page.

In [2]:
### IMPORTING URL DATA
url = 'https://en.wikipedia.org/wiki/List_of_songs_recorded_by_the_Beatles'
song_list = pd.read_html(url)

The third table on the page (index 2 in the list) holds the songs, so we take that element of the _song_list_ object as our pandas dataframe.

In [3]:
song_list = song_list[2]  # each element returned by read_html is already a DataFrame
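
Selecting the table by position is brittle, because the index can shift whenever the Wikipedia page is edited. Below is a minimal sketch of a more defensive lookup. It assumes the song table is the first one that has a `Song` column. The helper `pick_table` and the toy frames are hypothetical; they stand in for the list returned by `pd.read_html`.

```python
import pandas as pd


def pick_table(tables, required_column):
    """Return the first DataFrame in `tables` that has `required_column`."""
    for table in tables:
        if required_column in table.columns:
            return table
    raise ValueError(f"no table with a {required_column!r} column")


# Toy stand-ins (assumed) for the list of DataFrames that pd.read_html returns.
tables = [
    pd.DataFrame({"Key": ["#"], "Meaning": ["single"]}),
    pd.DataFrame({"Song": ['"Act Naturally"'], "Year": [1965]}),
]
picked = pick_table(tables, "Song")
print(list(picked.columns))
```

With the real data, the call would be `pick_table(song_list, 'Song')` before the list is overwritten by the selected dataframe.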

## 3. Data Exploration

Let us look at the first ten rows of the table.

In [4]:
song_list.head(10)

| | Song | Core catalogue release(s) | Songwriter(s) | Lead vocal(s)[d] | Year | Ref(s) |
|---|---|---|---|---|---|---|
| 0 | "Across the Universe"[e] | Let It BePast Masters | LennonMcCartney | Lennon | 1969 | [50][51] |
| 1 | "Act Naturally" | Help! | Johnny RussellVoni Morrison | Starr | 1965 | [52] |
| 2 | "All I've Got to Do" | With the Beatles | LennonMcCartney | Lennon | 1963 | [53] |
| 3 | "All My Loving" | With the Beatles | LennonMcCartney | McCartney | 1963 | [53] |
| 4 | "All Together Now" | Yellow Submarine | LennonMcCartney | McCartney(with Lennon) | 1969 | [54] |
| 5 | "All You Need Is Love"[f] # | Magical Mystery Tour | LennonMcCartney | Lennon | 1967 | [55][56] |
| 6 | "And I Love Her" | A Hard Day's Night | LennonMcCartney | McCartney | 1964 | [57] |
| 7 | "And Your Bird Can Sing" | Revolver | LennonMcCartney | Lennon | 1966 | [58] |
| 8 | "Anna (Go to Him)" | Please Please Me | Arthur Alexander | Lennon | 1963 | [59] |
| 9 | "Another Girl" | Help! | LennonMcCartney | McCartney | 1965 | [52] |


The dataframe does contain the list of songs, but every title is wrapped in double quotation marks, and some titles carry footnote markers such as [e] or [f]. The dataframe also has a column, _Ref(s)_, that holds nothing but reference numbers.

Next, let us check the dimensions of this dataframe.

In [5]:
song_list.shape

(213, 6)

So, the dataframe has 213 rows, one for each song in this table, and 6 columns.

## 4. Cleaning the Titles Dataframe

Let us start cleaning the dataframe. First, we drop the column containing reference numbers.

In [6]:
### DROPPING UNNECESSARY COLUMNS
song_list.drop(['Ref(s)'], axis = 1, inplace = True)
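
An in-place drop raises a `KeyError` if the cell is run a second time, because the column is already gone. Here is a minimal sketch of a re-runnable variant. It uses a toy frame (assumed) with the same column names as the scraped table.

```python
import pandas as pd

# Toy frame (assumed) mirroring three of the scraped columns.
demo = pd.DataFrame({"Song": ["Act Naturally"], "Year": [1965], "Ref(s)": ["[52]"]})

# errors="ignore" skips labels that are already absent instead of raising KeyError.
demo = demo.drop(columns=["Ref(s)"], errors="ignore")
demo = demo.drop(columns=["Ref(s)"], errors="ignore")  # second run is a no-op
print(list(demo.columns))
```

Assigning the result back, rather than using `inplace = True`, also keeps the original frame available if the step needs to be repeated.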

Next, let us extract clean song titles. We split each title on the double-quote character _"_ and keep the second piece (index 1), which is the text between the quotes.

In [7]:
### CLEANING THE SONG TITLES
# Splitting '"Title"[e]' on '"' gives ['', 'Title', '[e]'], so index 1 is the bare title.
title = [song_title.split('"')[1] for song_title in song_list['Song']]
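
The same extraction can also be written with pandas string methods. The sketch below assumes each title has exactly one quoted segment and runs on a small sample `Series` rather than the scraped column. Unlike the split above, a value without quotes becomes `NaN` instead of raising an `IndexError`.

```python
import pandas as pd

# Sample values copied from the table above; `raw` stands in for song_list['Song'].
raw = pd.Series(['"Across the Universe"[e]', '"All You Need Is Love"[f] #', '"Another Girl"'])

# Capture whatever sits between the first pair of double quotes.
clean = raw.str.extract(r'"([^"]*)"', expand=False)
print(clean.tolist())
# ['Across the Universe', 'All You Need Is Love', 'Another Girl']
```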

Now, we replace the _Song_ column with the cleaned titles.

In [8]:
### REPLACING TITLES WITH CLEANED ONES
song_list['Song'] = title
song_list.head(10)

| | Song | Core catalogue release(s) | Songwriter(s) | Lead vocal(s)[d] | Year |
|---|---|---|---|---|---|
| 0 | Across the Universe | Let It BePast Masters | LennonMcCartney | Lennon | 1969 |
| 1 | Act Naturally | Help! | Johnny RussellVoni Morrison | Starr | 1965 |
| 2 | All I've Got to Do | With the Beatles | LennonMcCartney | Lennon | 1963 |
| 3 | All My Loving | With the Beatles | LennonMcCartney | McCartney | 1963 |
| 4 | All Together Now | Yellow Submarine | LennonMcCartney | McCartney(with Lennon) | 1969 |
| 5 | All You Need Is Love | Magical Mystery Tour | LennonMcCartney | Lennon | 1967 |
| 6 | And I Love Her | A Hard Day's Night | LennonMcCartney | McCartney | 1964 |
| 7 | And Your Bird Can Sing | Revolver | LennonMcCartney | Lennon | 1966 |
| 8 | Anna (Go to Him) | Please Please Me | Arthur Alexander | Lennon | 1963 |
| 9 | Another Girl | Help! | LennonMcCartney | McCartney | 1965 |
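
The _Lead vocal(s)[d]_ header still carries a footnote marker. As an optional extra step, not part of the cleaning above, here is a minimal sketch that strips bracketed markers from column labels. The empty toy frame is an assumption used only to show the effect.

```python
import pandas as pd

# Empty toy frame (assumed) with three of the scraped column labels.
demo = pd.DataFrame(columns=["Song", "Lead vocal(s)[d]", "Year"])

# Remove any "[...]" footnote marker from each label; "(s)" is left untouched.
demo.columns = demo.columns.str.replace(r"\[[^\]]*\]", "", regex=True)
print(list(demo.columns))
# ['Song', 'Lead vocal(s)', 'Year']
```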


## 5. Saving the Titles Dataframe

Lastly, we save the dataframe as a CSV file, leaving out the row index.

In [9]:
song_list.to_csv('beatles_song_list.csv', index = False)
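
A quick way to confirm the export is to read the data back and compare it with the original. The sketch below does this with a toy frame and an in-memory buffer (both assumptions) so that it runs without touching the file system. For the real file, `pd.read_csv('beatles_song_list.csv')` would take the place of the buffer.

```python
import io

import pandas as pd

# Toy frame (assumed) standing in for the cleaned song_list.
demo = pd.DataFrame({"Song": ["Act Naturally", "Another Girl"], "Year": [1965, 1965]})

# Write to an in-memory text buffer instead of a file, then read it back.
buffer = io.StringIO()
demo.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored.shape)
print(restored["Song"].tolist() == demo["Song"].tolist())
```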