# Open Pub Application


Let’s assume you are on vacation in the United Kingdom with your friends. Just for fun, you decide to visit the nearby pubs for a few drinks, but Google Maps is down.

While searching the internet, you come across https://www.getthedata.com/open-pubs. This website lists all the pub locations, including their latitude and longitude. To impress your friends, you decide to build a web application from this data. (Go through the requirements mentioned below.)


In [1]:
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt

In [2]:
df1=pd.read_csv('open_pubs.csv')
df1.head()

Unnamed: 0,22,Anchor Inn,"Upper Street, Stratford St Mary, COLCHESTER",CO7 6LW,604749,234404,51.970379,0.979340,Babergh
0,36,Ark Bar Restaurant,"Ark Bar And Restaurant, Cattawade Street, Bran...",CO11 1RH,610194,233329,51.958698,1.057832,Babergh
1,74,Black Boy,"The Lady Elizabeth, 7 Market Hill, SUDBURY, Su...",CO10 2EA,587334,241316,52.038595,0.729915,Babergh
2,75,Black Horse,"Lower Street, Stratford St Mary, COLCHESTER",CO7 6JS,622675,-5527598,\N,\N,Babergh
3,76,Black Lion,"Lion Road, Glemsford, SUDBURY",CO10 7RF,622675,-5527598,\N,\N,Babergh
4,97,Brewers Arms,"The Brewers Arms, Bower House Tye, Polstead, C...",CO6 5BZ,598743,240655,52.028694,0.895650,Babergh
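
The column labels in the output above (`22`, `Anchor Inn`, ...) look like a data row, which suggests that `open_pubs.csv` has no header line and that pandas used the first pub as the header. Here is a minimal sketch of one way to avoid this. It assumes the file really is header-less and that its columns follow the order of the `Field` entries in `data_dictionary.csv`. The two sample rows are copied from the outputs in this notebook and stand in for the real file; the names `sample`, `columns` and `pubs` are illustrative.

```python
import io
import pandas as pd

# Two rows copied from the notebook output, standing in for open_pubs.csv
# (assumption: the real file has no header line).
sample = io.StringIO(
    '22,Anchor Inn,"Upper Street, Stratford St Mary, COLCHESTER",'
    'CO7 6LW,604749,234404,51.970379,0.979340,Babergh\n'
    '75,Black Horse,"Lower Street, Stratford St Mary, COLCHESTER",'
    'CO7 6JS,622675,-5527598,\\N,\\N,Babergh\n'
)

# Column names in the order listed in data_dictionary.csv.
columns = ["fsa_id", "name", "address", "postcode", "easting",
           "northing", "latitude", "longitude", "local_authority"]

# header=None keeps the first line as data; names= supplies the labels.
pubs = pd.read_csv(sample, header=None, names=columns)
print(pubs[["fsa_id", "name", "postcode"]])
```

Read this way, the Anchor Inn row stays in the data instead of becoming the column labels.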


In [3]:
data_labels=pd.read_csv('data_dictionary.csv')
data_labels.head()

Unnamed: 0,Field,Possible Values,Comments
0,fsa_id,int,Food Standard Agency's ID for this pub.
1,name,string,Name of the pub.
2,address,string,Address fields separated by commas.
3,postcode,string,Postcode of the pub.
4,easting,int,


In [4]:
data_labels.iloc[:,0]

0             fsa_id
1               name
2            address
3           postcode
4            easting
5           northing
6           latitude
7          longitude
8    local_authority
Name: Field, dtype: object

In [19]:
df2 = df1.copy()
# Rename the columns using the Field names from the data dictionary.
df2.columns = data_labels.iloc[:, 0].tolist()
df2.to_csv("df2.csv")
df2.head()

Unnamed: 0,fsa_id,name,address,postcode,easting,northing,latitude,longitude,local_authority
0,36,Ark Bar Restaurant,"Ark Bar And Restaurant, Cattawade Street, Bran...",CO11 1RH,610194,233329,51.958698,1.057832,Babergh
1,74,Black Boy,"The Lady Elizabeth, 7 Market Hill, SUDBURY, Su...",CO10 2EA,587334,241316,52.038595,0.729915,Babergh
2,75,Black Horse,"Lower Street, Stratford St Mary, COLCHESTER",CO7 6JS,622675,-5527598,\N,\N,Babergh
3,76,Black Lion,"Lion Road, Glemsford, SUDBURY",CO10 7RF,622675,-5527598,\N,\N,Babergh
4,97,Brewers Arms,"The Brewers Arms, Bower House Tye, Polstead, C...",CO6 5BZ,598743,240655,52.028694,0.895650,Babergh


In [6]:
df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 51330 entries, 0 to 51329
Data columns (total 9 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   fsa_id           51330 non-null  int64 
 1   name             51330 non-null  object
 2   address          51330 non-null  object
 3   postcode         51330 non-null  object
 4   easting          51330 non-null  int64 
 5   northing         51330 non-null  int64 
 6   latitude         51330 non-null  object
 7   longitude        51330 non-null  object
 8   local_authority  51330 non-null  object
dtypes: int64(3), object(6)
memory usage: 3.5+ MB


In [11]:
df3 = df2.copy()
# Missing coordinates are stored as the literal string "\N"; convert them to NaN.
df3 = df3.replace("\\N", np.nan)

In [13]:
df3.isnull().sum()

fsa_id               0
name                 0
address              0
postcode             0
easting              0
northing             0
latitude           767
longitude          767
local_authority      0
dtype: int64

The latitude and longitude columns each contain 767 null values.
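
The `\N` placeholder can also be turned into NaN while the file is being read, which lets pandas infer `float64` for the coordinate columns without a separate `replace` step. This is a minimal sketch rather than the notebook's own approach. It assumes a header-less file with the nine columns from the data dictionary, and uses two rows copied from the outputs in this notebook as a stand-in.

```python
import io
import pandas as pd

# Rows copied from the notebook output; "\N" marks a missing coordinate.
sample = io.StringIO(
    '75,Black Horse,"Lower Street, Stratford St Mary, COLCHESTER",'
    'CO7 6JS,622675,-5527598,\\N,\\N,Babergh\n'
    '102,Bristol Arms,"Bristol Hill, Shotley, IPSWICH",'
    'IP9 1PU,624624,233550,51.955042,1.267642,Babergh\n'
)

columns = ["fsa_id", "name", "address", "postcode", "easting",
           "northing", "latitude", "longitude", "local_authority"]

# na_values tells the parser to treat the literal string \N as missing.
pubs = pd.read_csv(sample, header=None, names=columns, na_values=["\\N"])

print(pubs[["latitude", "longitude"]].isna().sum())
print(pubs[["latitude", "longitude"]].dtypes)
```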

In [14]:
df4 = df3.copy()
# Drop every row that has a missing value in any column.
df4.dropna(axis=0, inplace=True, how='any')

In [15]:
# Convert the coordinate columns from strings to floats.
df4['latitude'] = df4['latitude'].astype(float)
df4['longitude'] = df4['longitude'].astype(float)

In [16]:
df4.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 50563 entries, 0 to 51329
Data columns (total 9 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   fsa_id           50563 non-null  int64  
 1   name             50563 non-null  object 
 2   address          50563 non-null  object 
 3   postcode         50563 non-null  object 
 4   easting          50563 non-null  int64  
 5   northing         50563 non-null  int64  
 6   latitude         50563 non-null  float64
 7   longitude        50563 non-null  float64
 8   local_authority  50563 non-null  object 
dtypes: float64(2), int64(3), object(4)
memory usage: 3.9+ MB
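
The `info()` output shows 50563 rows but an index that still runs from 0 to 51329, because `dropna` keeps the original row labels. The sketch below shows an alternative cleaning step on a tiny stand-in frame (hypothetical name `frame`, values taken from the outputs in this notebook). `pd.to_numeric(errors="coerce")` converts anything unparseable to NaN, `dropna(subset=...)` limits the check to the coordinate columns, and `reset_index(drop=True)` renumbers the rows. It is one possible approach under those assumptions, not a replacement for the cells above.

```python
import pandas as pd

# Tiny stand-in for df2: coordinates stored as strings, "\N" for missing.
frame = pd.DataFrame({
    "name": ["Anchor Inn", "Black Horse", "Bristol Arms"],
    "latitude": ["51.970379", "\\N", "51.955042"],
    "longitude": ["0.979340", "\\N", "1.267642"],
})

coords = ["latitude", "longitude"]

# errors="coerce" turns any value that cannot be parsed as a number into NaN.
frame[coords] = frame[coords].apply(pd.to_numeric, errors="coerce")

# Drop only rows lacking a coordinate, then renumber the index from 0.
clean = frame.dropna(subset=coords).reset_index(drop=True)
print(clean)
```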


In [17]:
df4.head()

Unnamed: 0,fsa_id,name,address,postcode,easting,northing,latitude,longitude,local_authority
0,36,Ark Bar Restaurant,"Ark Bar And Restaurant, Cattawade Street, Bran...",CO11 1RH,610194,233329,51.958698,1.057832,Babergh
1,74,Black Boy,"The Lady Elizabeth, 7 Market Hill, SUDBURY, Su...",CO10 2EA,587334,241316,52.038595,0.729915,Babergh
4,97,Brewers Arms,"The Brewers Arms, Bower House Tye, Polstead, C...",CO6 5BZ,598743,240655,52.028694,0.89565,Babergh
5,102,Bristol Arms,"Bristol Hill, Shotley, IPSWICH",IP9 1PU,624624,233550,51.955042,1.267642,Babergh
6,122,Caffeine Lounge,"14 Borehamgate Shopping Precinct, King Street,...",CO10 2ED,587527,241247,52.037903,0.732687,Babergh


In [18]:
df4.to_csv("updated_pubdata.csv")
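
By default `to_csv` writes the index as an unnamed first column, so reading the saved file back produces an extra `Unnamed: 0` column. The sketch below illustrates the difference using in-memory buffers and a two-row stand-in frame (hypothetical names throughout). Whether to keep the index depends on whether the original row labels matter to the web application.

```python
import io
import pandas as pd

# Stand-in frame with a non-contiguous index, like df4 after dropna.
frame = pd.DataFrame(
    {"fsa_id": [36, 97], "name": ["Ark Bar Restaurant", "Brewers Arms"]},
    index=[0, 4],
)

# Default behaviour: the index is written as an unnamed first column.
with_index = io.StringIO()
frame.to_csv(with_index)
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()

# index=False leaves the index out of the file.
without_index = io.StringIO()
frame.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = pd.read_csv(without_index).columns.tolist()

print(cols_with_index)
print(cols_without_index)
```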