# Project 6 - Boat & Yacht Sales

Importing Visualization Libraries and Data
---

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import os
import folium
import json

In [2]:
# create a path
path = r'C:\Users\niels\Documents\Sales Boat\Data Source'

In [3]:
# load data
df = pd.read_csv(os.path.join(path,'boat_data_cleaned_V2.csv'), index_col = False)

Load the GeoJSON data for European country boundaries
---

In [4]:
# Use a context manager so the file handle is closed after loading
with open("europe.geo.json", "r", encoding="utf-8") as f:
    europe_json = json.load(f)
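
The maps further down join our data to the boundaries through each feature's `properties.name`. A minimal sketch of how those names can be listed is shown here. The tiny `toy_geojson` dictionary is a hand-written stand-in and only assumes that `europe.geo.json` follows the usual FeatureCollection layout.

```python
# A tiny hand-written FeatureCollection standing in for europe.geo.json (assumed layout).
toy_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Italy"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "France"}, "geometry": None},
    ],
}

# Collect the country name stored on every feature.
names = [feature["properties"]["name"] for feature in toy_geojson["features"]]
print(names)
```

Running the same comprehension on `europe_json` should give the names that `key_on = 'feature.properties.name'` matches against.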

Data wrangling: drop countries with only one listing
---

In [5]:
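# Reload the data so this cell can be re-run on its own
df = pd.read_csv(os.path.join(path,'boat_data_cleaned_V2.csv'), index_col = False)
# Count listings per country and drop countries that appear only once
country_counts = df.Country.value_counts()
countries_to_remove = country_counts[country_counts <= 1]
df = df[~df.Country.isin(countries_to_remove.index)]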

Mapping the country names in our dataset to the names used in the GeoJSON file
---

In [6]:
# Rename countries whose spelling differs from the GeoJSON file
df.Country = df.Country.apply(lambda s: s.replace("Croatia (Hrvatska)", "Croatia").replace("Slovak Republic", "Slovakia"))

# Countries present in our dataset but missing from the GeoJSON file
df_countries = set(df.Country.unique())
json_countries = {feature["properties"]["name"] for feature in europe_json['features']}
df_countries.difference(json_countries)

{'Malta', 'Turkey'}

Countries in our dataset with no match in the GeoJSON file
---
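
The check above only looks in one direction (names in our data that the boundary file lacks). A minimal sketch of a two-way comparison follows. The helper name `compare_names` and the toy inputs are illustrative assumptions and not part of the notebook.

```python
def compare_names(data_names, geo_names):
    """Return names found only in the data and names found only in the GeoJSON."""
    data_names, geo_names = set(data_names), set(geo_names)
    only_in_data = sorted(data_names - geo_names)  # no boundary available for these
    only_in_geo = sorted(geo_names - data_names)   # boundaries left without data
    return only_in_data, only_in_geo

# Toy inputs for illustration; in the notebook these would be df_countries and json_countries.
toy_data = {"Italy", "Malta", "Turkey"}
toy_geo = {"Italy", "France"}

only_in_data, only_in_geo = compare_names(toy_data, toy_geo)
print(only_in_data)
print(only_in_geo)
```

Countries in the first list cannot be drawn at all, while countries in the second list are drawn with the `nan_fill_color`.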

In [7]:
m = folium.Map(location=[46.8155135,8.224471999999992],zoom_start=5, width=800, height=500, control_scale=True)
choropleth_nb = folium.Choropleth(
    geo_data = europe_json,
    data=df,
    columns = ['Country', 'Number of views last 7 days'],
    key_on = 'feature.properties.name',
    fill_color = "Reds",
    fill_opacity = .7,
    line_opacity = .5,
    legend_name = 'Number of views last 7 days',
    highlight = True,
    name='choropleth',
    nan_fill_color="rgba(0, 0, 0, 0)"
    
).add_to(m)
m

### Does the analysis answer any of your existing research questions?

This analysis shows which countries' boat listings received the most views over the last 7 days. It tells us where the boats that interest potential customers are located, which is a key indicator for the sales team.

### Does the analysis lead you to any new research questions?

This analysis could lead to questions such as: why does Italy have the most-viewed boats over the last 7 days?
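
One caveat when reading this map: `df` holds one row per boat, so each country appears many times in the `Country` column, while a choropleth shows a single value per country. As far as we can tell, `folium.Choropleth` keeps only one value per key when it is handed a DataFrame, so it may be safer to aggregate explicitly first. The sketch below uses made-up numbers and assumes the total number of views is the statistic of interest. `views_by_country` is a hypothetical name.

```python
import pandas as pd

# Made-up listings for illustration only; the real data comes from boat_data_cleaned_V2.csv.
toy = pd.DataFrame({
    "Country": ["Italy", "Italy", "Germany", "Germany", "Germany"],
    "Number of views last 7 days": [100, 300, 50, 60, 70],
})

# One row per country: total views across all of its listings.
views_by_country = (
    toy.groupby("Country", as_index=False)["Number of views last 7 days"].sum()
)
print(views_by_country)
```

Passing a frame like `views_by_country` as `data=` in place of `df` would give the map one well-defined number per country.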

In [8]:
m = folium.Map(location=[46.8155135,8.224471999999992],zoom_start=5, width=800, height=500, control_scale=True)
choropleth_price = folium.Choropleth(
    geo_data = europe_json,
    data=df,
    columns = ['Country', 'EUR_price'],
    key_on = 'feature.properties.name',
    fill_color = "Reds",
    fill_opacity = .8,
    line_opacity = .5,
    legend_name = 'Price (EUR)',
    highlight = True,
    name='choropleth',
    nan_fill_color="rgba(0, 0, 0, 0)"
    
).add_to(m)
m

### Does the analysis answer any of your existing research questions?

This analysis shows in which countries the most expensive boats are located. It gives us relevant information about prices across the market, which is a key topic for both customers and the sales team.

### Does the analysis lead you to any new research questions?

This analysis could lead to questions such as: why do Greece and Sweden have the most expensive boats on the market?
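
A first step toward that question could be to compare the mean and the median price per country, since a few very expensive yachts can pull the mean up. The sketch below uses made-up prices. Only the column names `Country` and `EUR_price` come from our dataset.

```python
import pandas as pd

# Made-up prices for illustration only.
toy = pd.DataFrame({
    "Country": ["Greece", "Greece", "Greece", "Italy", "Italy", "Italy"],
    "EUR_price": [50_000, 60_000, 2_000_000, 80_000, 90_000, 100_000],
})

# Number of listings, mean price and median price for each country.
price_summary = toy.groupby("Country")["EUR_price"].agg(["count", "mean", "median"])
print(price_summary)
```

In this toy data Greece has the higher mean but the lower median, which is the kind of pattern a single outlier produces.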

In [9]:
m = folium.Map(location=[46.8155135,8.224471999999992],zoom_start=5, width=800, height=500, control_scale=True)
choropleth_year = folium.Choropleth(
    geo_data = europe_json,
    data=df,
    columns = ['Country', 'Year Built'],
    key_on = 'feature.properties.name',
    fill_color = "Reds",
    fill_opacity = .8,
    line_opacity = .5,
    legend_name = 'Year Built',
    highlight = True,
    name='choropleth',
    nan_fill_color="rgba(0, 0, 0, 0)"
    
).add_to(m)
m

### Does the analysis answer any of your existing research questions?

We saw previously that potential customers are interested in new boats in stock. This analysis shows in which countries the newest boats are located. It answers one of our first questions: most of the new boats are located in Germany, Switzerland, Italy, the Netherlands, Poland, and Croatia.

### Does the analysis lead you to any new research questions?

This analysis could lead to questions such as: which country has the largest number of new boats?
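
A minimal sketch of how that question could be approached is to count, per country, the boats built from a given year onward. The cutoff `NEW_FROM_YEAR` is an assumption made for illustration, and the rows are made up.

```python
import pandas as pd

# Made-up listings for illustration only.
toy = pd.DataFrame({
    "Country": ["Germany", "Germany", "Italy", "Italy", "Poland"],
    "Year Built": [2020, 1995, 2019, 2020, 2001],
})

NEW_FROM_YEAR = 2019  # assumed cutoff for "new"; adjust to the project's own definition

# Keep only the boats built in or after the cutoff year, then count them per country.
new_boats = toy[toy["Year Built"] >= NEW_FROM_YEAR]
new_per_country = new_boats["Country"].value_counts()
print(new_per_country)
```

Countries without any boat at or above the cutoff simply do not appear in `new_per_country`.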

In [10]:
m = folium.Map(location=[46.8155135,8.224471999999992],zoom_start=5, width=800, height=500, control_scale=True)
choropleth_length = folium.Choropleth(
    geo_data = europe_json,
    data=df,
    columns = ['Country', 'Length'],
    key_on = 'feature.properties.name',
    fill_color = "Reds",
    fill_opacity = .8,
    line_opacity = .5,
    legend_name = 'Length of the boats',
    highlight = True,
    name='choropleth',
    nan_fill_color="rgba(0, 0, 0, 0)"
    
).add_to(m)
m

### Does the analysis answer any of your existing research questions?

This analysis shows in which countries the longest boats are located. It is interesting to see that Greece tops the chart, followed by Sweden and then Spain, Belgium, and Denmark. It does not answer any of our questions, but it is relevant data to share with the sales team in case they want to segment the market.

### Does the analysis lead you to any new research questions?

This analysis could lead to questions such as: where are the biggest and smallest boats in the market located?
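
That question could be explored directly in the table rather than on the map. The sketch below picks the longest and shortest listings with `nlargest` and `nsmallest`. The rows and lengths are made up, and only the column names come from our dataset.

```python
import pandas as pd

# Made-up listings for illustration only.
toy = pd.DataFrame({
    "Country": ["Greece", "Sweden", "Spain", "Denmark"],
    "Length": [45.0, 30.5, 12.0, 4.2],
})

# The two longest and the two shortest boats, with the country where each is located.
largest = toy.nlargest(2, "Length")
smallest = toy.nsmallest(2, "Length")
print(largest)
print(smallest)
```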
