### Business Questions
**G2M Strategy**

- Market Segmentation
- Product
- Buyer
- Routes to Market
***
**Pitfalls to Avoid** 
- Poor product market fit
- Oversaturation
***
**Methodologies**
- Funnel - Awareness, consideration, and decision stages of the customer’s journey
- Flywheel - Attracting, engaging, and delighting prospects, leads, and customers
***
**Components**
- **Product-Market Fit:** What problem(s) does your product solve?
- **Target Audience:** Who is experiencing the problem that your product solves? How much are they willing to pay for a solution? What are the pain points and frustrations that you can alleviate?
- **Competition and Demand:** Who already offers what you’re launching? Is there a demand for the product, or is the market oversaturated?
- **Distribution:** Through what mediums will you sell the product or service? A website, an app, or a third-party distributor?


# G2M Strategy - Model Development

**Prediction**

The cleaned dataset gives us a good understanding of the data and the relationships within it, and the insights gathered will help determine which type of model to use for predicting future `profit`.

The analysis will range from traditional hypothesis testing to more advanced modeling techniques.


The data covers the period **31/01/2016 to 31/12/2018**.


**Resources:**<br>
[G2M Strategy](https://blog.hubspot.com/sales/gtm-strategy)

### Building a G2M Strategy

1. Identify the buying center and personas.
2. Craft a value matrix to help identify messaging.
3. Test your messaging.
4. Optimize your ads based on the results of your tests before implementing them on a wide scale.
5. Understand your buyer’s journey.
6. Choose one (or more) of the four most common sales strategies.
7. Build brand awareness and demand generation with inbound and/or outbound methods.
8. Create content to get inbound leads.
9. Find ways to optimize your pipeline and increase conversion rates.
10. Analyze and shorten the sales cycle.
11. Reduce customer acquisition cost.
12. Strategize ways to tap into your existing customer base.
13. Adjust and iterate as you go.
14. Retain and delight your customers.

As a company owner using the G2M strategy, the focus will be largely on the company's customers/users.

1) Can we tell how much would be earned by quarter, period, year of a specific time period? (Predictive analysis, Time Series)

2) What areas (city) generate greatest profit? (Using population and number of users as a reference)

3) Do customers prefer Pink Cab or Yellow Cab?

4) Average age of user? (if applicable create age bins)

5) Average income of user? (hist, highest income user - states)

6) If company purchases a specific fleet of vehicles what is the time range for ROI? (Apply hypothesis testing and A/B Testing)

7) What are users/consumers using more (cash or card)? Are prices equivalent - Upcharge if cash, card transaction is pre-pickup - lower rate?
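Question 1 can be approached by summing `profit` over calendar periods before any forecasting is attempted. The following is a minimal sketch, assuming a DataFrame with `travel_date` and `profit` columns like the cleaned data; the `toy` rows are made-up values used only for illustration.

```python
import pandas as pd

# Hypothetical toy rows; the real values would come from the cleaned cab data
toy = pd.DataFrame({
    "travel_date": ["2016-01-31", "2016-02-06", "2016-05-10", "2017-03-01"],
    "profit": [27.5, 57.5, 10.0, 5.0],
})
toy["travel_date"] = pd.to_datetime(toy["travel_date"])

# Total profit per calendar quarter and per calendar year
by_quarter = toy.groupby(toy["travel_date"].dt.to_period("Q"))["profit"].sum()
by_year = toy.groupby(toy["travel_date"].dt.year)["profit"].sum()

print(by_quarter)
print(by_year)
```

The quarterly series produced this way is a natural input for the time series tools imported below.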

In [2]:
import os
import sys
import pandas as pd
import numpy as np

# Visualizations
import seaborn as sns
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight') 
%matplotlib inline
import plotly.graph_objects as go
import plotly.express as px
from pylab import rcParams
from plotly import tools
import plotly
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.figure_factory as ff

# Time Series
from statsmodels.tsa import stattools as ts
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.stattools import acf, pacf
# statsmodels.tsa.arima_model was removed in statsmodels 0.13; use the replacement module
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima.model import ARIMAResults

# Sklearn
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score

# Function to display plotly in jupyter notebook
def enable_plotly_in_cell():
    import IPython
    from plotly.offline import init_notebook_mode
    display(IPython.core.display.HTML('''<script src="/static/components/requirejs/require.js"></script>'''))
    init_notebook_mode(connected=False)

In [3]:
cab_data = pd.read_csv('/Users/jasonrobinson/Desktop/VC/notebooks/cab_data_clean.csv')
cab_yellow = pd.read_csv('/Users/jasonrobinson/Desktop/VC/notebooks/yellow_cab.csv')
cab_pink = pd.read_csv('/Users/jasonrobinson/Desktop/VC/notebooks/pink_cab.csv')

In [7]:
cab_pink.head()

| Unnamed: 0 | travel_date | transact_id | company | city | km_travelled | price_charged | trip_cost | customer_id | payment_mode | gender | age | monthly_income | profit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2016-02-06 | 10000011 | Pink Cab | ATLANTA GA | 30.45 | 370.95 | 313.635 | 29290 | Card | Male | 28 | 10813 | 57.315 |
| 1 | 2016-02-04 | 10000012 | Pink Cab | ATLANTA GA | 28.62 | 358.52 | 334.854 | 27703 | Card | Male | 27 | 9237 | 23.666 |
| 2 | 2018-11-25 | 10395626 | Pink Cab | ATLANTA GA | 13.39 | 167.03 | 141.934 | 27703 | Card | Male | 27 | 9237 | 25.096 |
| 3 | 2016-01-31 | 10000013 | Pink Cab | ATLANTA GA | 9.04 | 125.2 | 97.632 | 28712 | Cash | Male | 53 | 11242 | 27.568 |
| 4 | 2016-02-05 | 10000014 | Pink Cab | ATLANTA GA | 33.17 | 377.4 | 351.602 | 28020 | Cash | Male | 23 | 23327 | 25.798 |


### Build on the statistical analysis to answer the business questions

Perform analysis and provide visualizations to gather further insights and recommendations for the next steps in determining the optimal investment under the G2M strategy.

To achieve this, we will apply machine learning techniques to predict future profit.

We will also use time series analysis of the data to determine the optimal investment strategy.


In [8]:
cab_pink['payment_mode'].unique()

array(['Card', 'Cash'], dtype=object)

In [None]:
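# pd.get_dummies returns columns in sorted category order, so the new names must follow
# that order: Female/Male, Pink Cab/Yellow Cab, Card/Cash
cab_data[['female', 'male']] = pd.get_dummies(cab_data["gender"])
cab_data[['pink_cab', 'yellow_cab']] = pd.get_dummies(cab_data["company"])
cab_data[['card', 'cash']] = pd.get_dummies(cab_data["payment_mode"])
# Drop the original categorical columns and the customer identifier
cab_data = cab_data.drop(['gender', 'company', 'payment_mode', 'customer_id'], axis=1)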
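Because `pd.get_dummies` orders its output columns by sorted category value rather than by order of appearance, it can help to check that order before assigning new names. The sketch below illustrates this on a hypothetical `toy` frame that mirrors the `company` values; renaming through an explicit mapping is one way to avoid depending on column order at all.

```python
import pandas as pd

# Hypothetical toy column mirroring the `company` values in the data
toy = pd.DataFrame({"company": ["Yellow Cab", "Pink Cab", "Yellow Cab"]})

dummies = pd.get_dummies(toy["company"])

# Columns follow sorted category order, not order of appearance
column_order = list(dummies.columns)
print(column_order)

# Renaming by explicit mapping does not depend on that order
dummies = dummies.rename(columns={"Pink Cab": "pink_cab", "Yellow Cab": "yellow_cab"})
print(dummies)
```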

In [None]:
cab_data['monthly_income'].sort_values(ascending=False)[:3]

In [None]:
cab_data.sort_index(ascending=False)

In [None]:
cab_data.columns

In [None]:
#cab_data['travel_date']= pd.to_datetime(cab_data['travel_date'], infer_datetime_format=True)

In [None]:
# 5-day, 4-week work month
cab_data['daily_income'] = (cab_data['monthly_income'] / 20 ).round(2)

In [None]:
sns.displot(cab_data['monthly_income']);

In [None]:
sns.displot(cab_data['trip_cost']);

In [None]:
# Remove outliers above count of 5000:
# keep only rows with trip_cost strictly between 100 and 600
cab_data = cab_data[(cab_data['trip_cost'] > 100) & (cab_data['trip_cost'] < 600)]
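A range filter like the one above can be written as a single boolean mask, with each condition wrapped in parentheses and joined by `&` (Python's `and` does not work element-wise on a Series). The sketch below uses a made-up `toy` frame and also shows `Series.between` with both bounds excluded, which assumes pandas 1.3 or later.

```python
import pandas as pd

# Hypothetical toy values chosen to sit on and around both bounds
toy = pd.DataFrame({"trip_cost": [50.0, 100.0, 250.0, 599.9, 600.0, 900.0]})

# Combine both conditions element-wise with &
mask = (toy["trip_cost"] > 100) & (toy["trip_cost"] < 600)
filtered = toy[mask]

# Equivalent filter using Series.between with both bounds excluded
filtered_between = toy[toy["trip_cost"].between(100, 600, inclusive="neither")]

print(filtered)
```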


In [None]:
# Percentage of monthly income on cost of trip
# (stored as a fraction; multiply by 100 for a percentage)
percentage_monthly_income = cab_data['trip_cost'] / cab_data['monthly_income']

## Classification

Another approach we can take with this dataset is classification: using multivariate analysis to determine whether a user pays by cash or card.

In [None]:
# Majority class
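The majority-class cell is empty in the original, so the following is only an illustrative sketch and not the author's method. It assumes `payment_mode` is the classification target and uses made-up values; the share of the most frequent class is the accuracy a classifier has to beat.

```python
import pandas as pd

# Hypothetical toy target values (Card / Cash, as in the data)
payment_mode = pd.Series(["Card", "Cash", "Card", "Card", "Cash"])

# Relative frequency of each class
class_shares = payment_mode.value_counts(normalize=True)

# Always predicting the most frequent class yields this accuracy
majority_class = class_shares.idxmax()
baseline_accuracy = class_shares.max()

print(majority_class, baseline_accuracy)
```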


In [None]:
# Split data into train, val, and test
train = cab_data.iloc[:int(len(cab_data) * 0.8)]
val = cab_data.iloc[int(len(cab_data) * 0.8):int(len(cab_data) * 0.9)]
test = cab_data.iloc[int(len(cab_data) * 0.9):]
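The split above is positional, so if the rows are ordered (the preview shows consecutive `ATLANTA GA` rows), the train, validation, and test sets may cover different cities. For a classification task, one option is to shuffle before splitting; for a time series forecast, a chronological split would be the better fit. A minimal sketch of the shuffled variant on a hypothetical `toy` frame, keeping the same 80/10/10 proportions:

```python
import pandas as pd

# Hypothetical toy frame ordered by city, like the preview of the real data
toy = pd.DataFrame({"trip_cost": range(100), "city": ["A"] * 50 + ["B"] * 50})

# Shuffle rows reproducibly, then apply the same 80/10/10 positional split
shuffled = toy.sample(frac=1, random_state=42).reset_index(drop=True)
n = len(shuffled)
train_end, val_end = int(n * 0.8), int(n * 0.9)

toy_train = shuffled.iloc[:train_end]
toy_val = shuffled.iloc[train_end:val_end]
toy_test = shuffled.iloc[val_end:]

print(len(toy_train), len(toy_val), len(toy_test))
```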

In [None]:
target = 'trip_cost'
features = cab_data.columns.drop('trip_cost')

X_train = train[features]
y_train = train[target]

X_val = val[features]
y_val = val[target]

X_test = test[features]
y_test = test[target]
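Taking every remaining column as a feature keeps text columns such as `travel_date` and `city`, which most scikit-learn estimators cannot consume directly. It also keeps `profit`, which in the preview rows equals `price_charged - trip_cost`, so a model could reconstruct the target exactly. The sketch below shows one possible way to narrow the feature set, using two rows copied from the preview; which columns to exclude is an assumption, not something the original specifies.

```python
import pandas as pd

# Two rows copied from the cab_pink preview (subset of columns)
toy = pd.DataFrame({
    "travel_date": ["2016-02-06", "2016-02-04"],
    "city": ["ATLANTA GA", "ATLANTA GA"],
    "km_travelled": [30.45, 28.62],
    "price_charged": [370.95, 358.52],
    "trip_cost": [313.635, 334.854],
    "profit": [57.315, 23.666],
})

target = "trip_cost"

# Keep numeric columns only, then drop the target and the column derived from it
numeric_cols = toy.select_dtypes(include="number").columns
features = numeric_cols.drop([target, "profit"])

print(list(features))
```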