# Introduction/Business Problem

## Problem Statement
Tokyo is a booming city, full of people and prospective businesses. Many workers, tourists and residents need places to eat in the five busiest neighborhoods in Tokyo. Opening a cafe there makes sense, but we want to know which neighborhood would be the best choice by understanding the benefits and hindrances of opening a cafe in each of them.

We will look specifically at how much a location would cost to determine whether it is a good place to open. A good profit margin for a cafe would be about 20% or higher.

## Target Audience 

People who would be interested in the results of this analysis include prospective cafe owners, as well as tourists, office workers, locals and students in the area. These groups benefit most from the data: business owners gain an understanding of the pros and cons of each area, which can help them earn more, and customers gain a new cafe close to them.

# Data

This data is used to find five cities (Tokyo's special wards) together with their population and area. From there we can gather more information in the next stage of the project. The source of this data is Wikipedia.

In [154]:
!pip install folium

import json

import folium
import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup
from geopy.geocoders import Nominatim
from pandas import json_normalize  # replaces the deprecated pandas.io.json import



In [28]:
!pip install geopy



### Scraping wiki cities

In [102]:
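# Read the special wards table from Wikipedia (table index 3 when this was run).
df_wiki = pd.read_html('https://en.wikipedia.org/wiki/Special_wards_of_Tokyo')[3]
# Keep only the name, population and area columns.
dfwiki = df_wiki.drop(columns=['No.', 'Flag', 'Kanji', 'Density(/km2)', 'Major districts'])
# Keep the first five rows of the table.
dfwik = dfwiki.iloc[0:5]
print(dfwik)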

       Name  Population(as of October 2016  Area(km2)
0   Chiyoda                          59441      11.66
1      Chūō                         147620      10.21
2    Minato                         248071      20.37
3  Shinjuku                         339211      18.22
4    Bunkyō                         223389      11.29


In [8]:
! pip install -U googlemaps

Collecting googlemaps
  Downloading https://files.pythonhosted.org/packages/e8/0c/5f84b84b1b73c4710fe0b9fa062f5afe873013c7a2f2141127fd1939359c/googlemaps-3.1.1-py3-none-any.whl
Installing collected packages: googlemaps
Successfully installed googlemaps-3.1.1


### Finding latitude and longitude of cities.

In [30]:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="python")
location = geolocator.geocode("Chiyoda")
print((location.latitude, location.longitude))

(35.6938097, 139.7532163)


In [31]:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="python")
location = geolocator.geocode("Chūō")
print((location.latitude, location.longitude))

(35.666255, 139.775565)


In [33]:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="python")
location = geolocator.geocode("Minato")
print((location.latitude, location.longitude))

(35.6432274, 139.7400553)


In [34]:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="python")
location = geolocator.geocode("Shinjuku")
print((location.latitude, location.longitude))

(35.6937632, 139.7036319)


In [35]:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="python")
location = geolocator.geocode("Bunkyō")
print((location.latitude, location.longitude))

(35.71881, 139.744732)
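The five lookups above repeat the same steps, so they could be collapsed into one loop. The following is a minimal sketch under assumptions: `geocode_wards` and `Coordinates` are hypothetical names that do not appear in the original cells, and the helper accepts any callable that returns an object with `latitude` and `longitude` attributes (or `None`), which is how `Nominatim(...).geocode` behaves.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Coordinates = Tuple[float, float]


def geocode_wards(
    names: Iterable[str],
    geocode: Callable[[str], Optional[object]],
) -> Dict[str, Optional[Coordinates]]:
    """Look up (latitude, longitude) for each name.

    `geocode` is any callable returning an object with `.latitude` and
    `.longitude`, or None when nothing is found.
    """
    coordinates: Dict[str, Optional[Coordinates]] = {}
    for name in names:
        location = geocode(name)
        # Store None instead of raising when the lookup finds nothing.
        if location is None:
            coordinates[name] = None
        else:
            coordinates[name] = (location.latitude, location.longitude)
    return coordinates


# Usage with geopy (requires network access, so it is left commented out):
# from geopy.geocoders import Nominatim
# geolocator = Nominatim(user_agent="python")
# coords = geocode_wards(
#     ["Chiyoda", "Chūō", "Minato", "Shinjuku", "Bunkyō"], geolocator.geocode
# )
```

Short names such as "Minato" may match more than one place, so a fuller query string (for example "Minato, Tokyo, Japan") is probably a safer input, although the cells above did not need one.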


### Table of cities, lat and long.

In [79]:
# Coordinates are the geocoding results printed above, stored as strings.
dfmap = {
    'Name': ['Chiyoda', 'Chūō', 'Minato', 'Shinjuku', 'Bunkyō'],
    'Latitude': ['35.6938097', '35.666255', '35.6432274', '35.6937632', '35.71881'],
    'Longitude': ['139.7532163', '139.775565', '139.7400553', '139.7036319', '139.744732'],
}
df = pd.DataFrame(dfmap, columns=['Name', 'Latitude', 'Longitude'])
df

       Name    Latitude    Longitude
0   Chiyoda  35.6938097  139.7532163
1      Chūō   35.666255   139.775565
2    Minato  35.6432274  139.7400553
3  Shinjuku  35.6937632  139.7036319
4    Bunkyō    35.71881   139.744732


In [150]:
dftotal =pd.merge(dfwik, df, on='Name',how='outer')
dftotal

       Name  Population(as of October 2016  Area(km2)    Latitude    Longitude
0   Chiyoda                          59441      11.66  35.6938097  139.7532163
1      Chūō                         147620      10.21   35.666255   139.775565
2    Minato                         248071      20.37  35.6432274  139.7400553
3  Shinjuku                         339211      18.22  35.6937632  139.7036319
4    Bunkyō                         223389      11.29    35.71881   139.744732


In [67]:
# Pass the DataFrames themselves (not their names as strings) and join on 'Name'.
display(dfwik, df, pd.merge(dfwik, df, on='Name'))

### Scraping land price values

In [104]:
df_land=pd.read_html('https://utinokati.com/en/details/land-market-value/area/Tokyo/')[0]
dfland=df_land.drop(columns=['Average Trading Price'],axis=1)
dfland=dfland.iloc[0:5]
print(dfland)


          Area  Average Unit Price
0   Chiyoda-Ku  1,827,610 JPY/sq.m
1      Chuo-Ku  3,222,564 JPY/sq.m
2    Minato-Ku  2,253,006 JPY/sq.m
3  Shinjuku-Ku    915,879 JPY/sq.m
4    Bunkyo-Ku    957,330 JPY/sq.m
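The scraped prices are strings, and the area labels ("Chuo-Ku") do not match the names in the Wikipedia table ("Chūō"), so both need cleaning before the two tables can be joined on `Name`. The notebook does not show how this was done. The sketch below is one possible approach, not the original method: `WARD_NAME_MAP`, `parse_unit_price` and `normalize_ward_name` are hypothetical names, and the mapping is an assumption based on the five rows printed above.

```python
import re

# Assumed mapping from the land-price page labels to the Wikipedia spellings
# (the two sources differ in the "-Ku" suffix and in macrons).
WARD_NAME_MAP = {
    "Chiyoda-Ku": "Chiyoda",
    "Chuo-Ku": "Chūō",
    "Minato-Ku": "Minato",
    "Shinjuku-Ku": "Shinjuku",
    "Bunkyo-Ku": "Bunkyō",
}


def parse_unit_price(text: str) -> int:
    """Convert a string such as '1,827,610 JPY/sq.m' to the integer 1827610."""
    match = re.match(r"\s*([\d,]+)\s*JPY/sq\.m\s*$", text)
    if match is None:
        raise ValueError(f"Unexpected price format: {text!r}")
    # Drop the thousands separators before converting.
    return int(match.group(1).replace(",", ""))


def normalize_ward_name(area: str) -> str:
    """Map a land-price 'Area' label to the matching Wikipedia 'Name'."""
    return WARD_NAME_MAP[area]


print(normalize_ward_name("Chuo-Ku"), parse_unit_price("3,222,564 JPY/sq.m"))
```

With pandas, these helpers could be applied through `dfland['Area'].map(normalize_ward_name)` and `dfland['Average Unit Price'].map(parse_unit_price)` before merging with `dftotal`.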


In [131]:
dftotal

       Name  Population(as of October 2016  Area(km2)    Latitude    Longitude      Average Prices       Average Price
0   Chiyoda                          59441      11.66  35.6938097  139.7532163  1,827,610 JPY/sq.m  1,827,610 JPY/sq.m
1      Chūō                         147620      10.21   35.666255   139.775565  3,222,564 JPY/sq.m  3,222,564 JPY/sq.m
2    Minato                         248071      20.37  35.6432274  139.7400553  2,253,006 JPY/sq.m  2,253,006 JPY/sq.m
3  Shinjuku                         339211      18.22  35.6937632  139.7036319    915,879 JPY/sq.m    915,879 JPY/sq.m
4    Bunkyō                         223389      11.29    35.71881   139.744732    957,330 JPY/sq.m    957,330 JPY/sq.m


In [141]:
dft=dftotal.drop(columns='Average Prices',axis=1)
dft

       Name  Population(as of October 2016  Area(km2)    Latitude    Longitude       Average Price
4    Bunkyō                         223389      11.29    35.71881   139.744732    957,330 JPY/sq.m
3  Shinjuku                         339211      18.22  35.6937632  139.7036319    915,879 JPY/sq.m
1      Chūō                         147620      10.21   35.666255   139.775565  3,222,564 JPY/sq.m
2    Minato                         248071      20.37  35.6432274  139.7400553  2,253,006 JPY/sq.m
0   Chiyoda                          59441      11.66  35.6938097  139.7532163  1,827,610 JPY/sq.m


## Methodology

In this report, I found the most populated cities and then determined which of them were the most expensive. A higher price can be an advantage or a disadvantage: more people in the area may mean more customers for the cafe, but the building may be too expensive to keep up. I scraped the data and organized it into tables in order to best understand it.
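The comparison described here can be sketched in a few lines of plain Python. This is only an illustration: the values are copied by hand from the tables earlier in the notebook, and the `wards` list and `population_density` helper are hypothetical names that do not appear in the original cells. Population density is added as an extra column because area matters to the discussion that follows.

```python
# Values copied by hand from the tables earlier in this notebook.
wards = [
    {"name": "Chiyoda", "population": 59441, "area_km2": 11.66, "price_jpy_per_sqm": 1827610},
    {"name": "Chūō", "population": 147620, "area_km2": 10.21, "price_jpy_per_sqm": 3222564},
    {"name": "Minato", "population": 248071, "area_km2": 20.37, "price_jpy_per_sqm": 2253006},
    {"name": "Shinjuku", "population": 339211, "area_km2": 18.22, "price_jpy_per_sqm": 915879},
    {"name": "Bunkyō", "population": 223389, "area_km2": 11.29, "price_jpy_per_sqm": 957330},
]


def population_density(ward):
    """Residents per square kilometre."""
    return ward["population"] / ward["area_km2"]


# Most populated first, and cheapest land first.
by_population = sorted(wards, key=lambda w: w["population"], reverse=True)
by_price = sorted(wards, key=lambda w: w["price_jpy_per_sqm"])

for ward in by_population:
    print(
        f'{ward["name"]:<9} population={ward["population"]:>7,} '
        f'density={population_density(ward):>9,.0f}/km2 '
        f'price={ward["price_jpy_per_sqm"]:>9,} JPY/sq.m'
    )
```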

## Results/Discussion

I observed that there is no apparent trend between population size, price and area.
Shinjuku has the lowest price and the largest population. However, it also has the second largest area, which means its residents are more spread out and not as many of them might be near any one shop. The results show that Chūō has the highest price but the smallest area and the second lowest population. Because of that, I would not recommend opening in Chūō, because the cafe would likely not do well.
Because Shinjuku combines the lowest price with the largest population, I would recommend opening a cafe there.
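One way to check the observation about trends is to compute a Pearson correlation coefficient for each pair of columns. The sketch below is not part of the original analysis: the `pearson` helper and the `correlations` dictionary are hypothetical names, and the values are copied by hand from the tables above. With only five data points, any coefficient is weak evidence and should be read as a rough indication at most.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Order: Chiyoda, Chūō, Minato, Shinjuku, Bunkyō (copied from the tables above).
columns = {
    "population": [59441, 147620, 248071, 339211, 223389],
    "area_km2": [11.66, 10.21, 20.37, 18.22, 11.29],
    "price_jpy_per_sqm": [1827610, 3222564, 2253006, 915879, 957330],
}

# One coefficient per pair of columns.
correlations = {
    (a, b): pearson(columns[a], columns[b]) for a, b in combinations(columns, 2)
}

for (a, b), r in correlations.items():
    print(f"{a} vs {b}: r = {r:+.2f}")
```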

## Conclusion 

In conclusion, based on the data collected, I would recommend opening a cafe in Shinjuku. It has the lowest land price along with the largest population of the five cities. This means that there are more people in the area and your rent is likely to be lower. That could allow you to price your drinks lower in order to attract more customers and make even more of a profit.