#### Machine Learning for Business - DataCamp

- ML is applying statistical/computer science methods to data to:
1. draw causal insights
    - what is causing customers to cancel subscription?
2. predict future events
    - which customers are likely to cancel their subscription
3. understand patterns
    - are there groups of customers who are similar
    
Data Hierarchy of Needs
1. collection - extract data from source
2. storage - reliable storage
3. preparation - organize and clean data to make it usable
4. analysis - understand trends, distributions, segments
5. prototyping and testing ML - interpretable simple models, tests/experiments
6. ML in production - complex models in production/research/automation

Focus on 5 and 6 in this course

#### Supervised vs. Unsupervised ML
- Supervised - draw causal insights, predict future events
- Unsupervised - understand patterns in data

Supervised ML data structure:
- Target variable - what we want to predict ex: fraud probability
- Input features - list of columns with data ex: transaction data
- SML uses the input features to predict the target variable

Unsupervised ML data structure:
- Uses input features (data points) to segment data

Ex: Marketing
- SML used to predict which cust. are likely to purchase next month, predict cust. lifetime value
- Unsupervised ML used to group customers into segments based on past purchases
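
A minimal sketch of the two data structures described above. The customer records, column names, and values are made up for illustration; they are not from the course.

```python
# Hypothetical customer records; names and values are invented for this sketch.
customers = [
    {"past_purchases": 3, "avg_basket_usd": 20.0, "purchased_next_month": 0},
    {"past_purchases": 12, "avg_basket_usd": 55.0, "purchased_next_month": 1},
    {"past_purchases": 7, "avg_basket_usd": 31.5, "purchased_next_month": 1},
]

feature_names = ["past_purchases", "avg_basket_usd"]
target_name = "purchased_next_month"

# Supervised ML: input features X are paired with a known target y.
X = [[row[name] for name in feature_names] for row in customers]
y = [row[target_name] for row in customers]

# Unsupervised ML: the same input features, with no target column at all.
X_unlabeled = X

print(X)
print(y)
```

The only structural difference is whether a target column exists: with `y` the model learns to predict it, without `y` it can only look for structure in `X`.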


#### Job roles, tools, and technologies

Roles/Tools/Tech for Data Hierarchy of Needs
1. collection - infrastructure owners
2. storage - data engineers, database administrators
3. preparation - data engineers / data analysts
4. analysis - data analysts / data scientists
5. prototyping and testing ML - data scientists / ML engineers
6. ML in production - ML engineers

Team Structure
1. centralized - all data functions in one central team (small scale, maintains focus)
2. decentralized - each business unit has its own data function (can cause issues with overlap and silos)
3. hybrid - infrastructure, definitions, methods, tooling are centralized and application and prototyping are decentralized


### Chapter 2
#### Prediction vs. Inference Dilemma

Inference or causal models 
- the goal is to understand the drivers of a business outcome
- interpretable - easy to understand
- less accurate
- Ex: What are the main drivers of fraud? How much does condition X impact risk? What is the cause? what are the effects of conditions?

Prediction
- prediction itself is the main goal
- not easily interpretable, black box
- Ex: which transactions are likely fraudulent? Is the patient at risk of condition x? predictions based on variables?

There is a trade-off between accuracy and interpretability: some models are easy to understand but do not have optimal predictions; others have excellent predictions but it's unclear how they got there.

Modeling data structure
- Ex: fraud probability, inference model is concerned with which input features affect fraud probability; prediction model is only concerned with accurately predicting fraud risk

#### Inference (causal) Models

What is causality 
- identify causal relationship of how much a certain action affects an outcome of interest
- answers the why question
- optimizes for model interpretability over predictive performance
- models try to detect patterns in observed data and draw causal conclusions

Experiments vs. Observations
- experiments are designed (e.g. A/B tests), so a well-run experiment supports causal conclusions directly
- when experiments are impossible, the models (observational studies) are used to calculate effect of inputs on outcomes
- experiments are preferred over observational studies
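
A rough sketch of why a designed experiment supports a causal reading: customers are assigned to treatment at random, so the difference in group means estimates the effect of the treatment. The spend distribution and the 2 USD lift are simulated assumptions, not course data.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the simulation is repeatable

N_CUSTOMERS = 1000
ASSUMED_LIFT_USD = 2.0  # made-up treatment effect, for illustration only

# Random assignment: shuffle the ids and give the first half the treatment.
customer_ids = list(range(N_CUSTOMERS))
random.shuffle(customer_ids)
treatment_ids = set(customer_ids[: N_CUSTOMERS // 2])

treated, control = [], []
for customer_id in range(N_CUSTOMERS):
    baseline = random.gauss(50.0, 5.0)  # simulated spend without treatment
    if customer_id in treatment_ids:
        treated.append(baseline + ASSUMED_LIFT_USD)
    else:
        control.append(baseline)

# Difference in group means is the estimated treatment effect.
estimated_effect = mean(treated) - mean(control)
print(f"estimated effect: {estimated_effect:.2f} USD")
```

In an observational study the assignment is not random, so the same subtraction can mix the treatment effect with pre-existing differences between the groups.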

Inference model example
- data set of customers with spending observations (input features) and a target variable of next month spending
- how much do the input variables affect the target
- run a model to learn the prediction rules
- regression coefficients for each variable tell us how strongly the input and target are related and whether the relationship is positive or negative
- ex: last month spending reg coef = 0.58, so a customer who spent 1 USD more last month is expected to spend 0.58 USD more next month, with the other inputs held equal
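
To make the coefficient reading concrete, here is a small sketch that fits a one-input least-squares regression on synthetic data. The data is generated with a slope of 0.58 to mirror the example above; the 10 USD baseline, the noise level, and the function name `fit_simple_regression` are assumptions for illustration.

```python
import random

random.seed(42)  # fixed seed so the synthetic data is repeatable

# Synthetic spending data: slope 0.58 mirrors the example, the rest is assumed.
last_month = [random.uniform(0, 100) for _ in range(200)]
next_month = [10.0 + 0.58 * x + random.gauss(0, 2.0) for x in last_month]

def fit_simple_regression(x, y):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

coefficient, intercept = fit_simple_regression(last_month, next_month)
print(f"coefficient: {coefficient:.2f}, intercept: {intercept:.2f}")
```

The fitted coefficient should land close to 0.58: each extra USD of last month's spending goes with roughly 0.58 USD more predicted spending next month. With several inputs, each coefficient is read the same way while the other inputs are held fixed.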

#### Prediction Models (Supervised Learning)
- Classification: Predicting class/type of an outcome
- Regression: predicting quantity of an outcome

Supervised - Classification Model
- target variable is categorical (discrete) (class of outcome)
- Ex: will customer cancel subscription? Is transaction fraudulent? 


Supervised - Regression Model
- target variable is continuous (numeric) (amount of outcome)
- Ex: Number of product purchases? Dollars spent?
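
The difference between the two model types is only the kind of target. As a sketch, the same 1-nearest-neighbor rule (chosen here for brevity, not taken from the course) returns a class when the target is categorical and a quantity when it is continuous. The features and values are hypothetical.

```python
def nearest_neighbor_predict(features, targets, query):
    """Return the target of the training row closest to `query` (1-nearest neighbor)."""
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row, query))

    best_index = min(range(len(features)), key=lambda i: squared_distance(features[i]))
    return targets[best_index]

# Hypothetical inputs: [months_subscribed, support_tickets]
features = [[1, 5], [24, 0], [3, 4], [36, 1]]

churned = ["yes", "no", "yes", "no"]        # classification target (categorical)
dollars_spent = [20.0, 310.0, 45.0, 480.0]  # regression target (continuous)

new_customer = [2, 4]
predicted_class = nearest_neighbor_predict(features, churned, new_customer)
predicted_amount = nearest_neighbor_predict(features, dollars_spent, new_customer)
print(predicted_class, predicted_amount)
```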

#### Prediction Models (Unsupervised)
- Clustering: grouping observations
- Anomaly detection - detecting which observations fall outside the discovered "regular pattern" and using them as an input in Supervised Learning or as a business input
- recommender engines - eg Netflix movie recommendations
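
One simple way to illustrate anomaly detection is a median-based rule: flag values that sit far from the median relative to the median absolute deviation (MAD). This is a stand-in technique for illustration, not the course's method; the amounts, the 3.5 threshold, and the function name are assumptions.

```python
from statistics import median

def flag_anomalies(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds the threshold."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return [False] * len(values)  # no spread to measure against
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [0.6745 * abs(v - center) / mad > threshold for v in values]

# Hypothetical transaction amounts in USD; the last one breaks the regular pattern.
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]
flags = flag_anomalies(amounts)
print([a for a, is_anomaly in zip(amounts, flags) if is_anomaly])
```

The resulting flags could then be fed to a supervised model as an extra input feature or reviewed directly by the business.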

Clustering example 
- segmentation
- training
- discover clusters
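
The clustering steps above can be sketched with a bare-bones k-means loop. The customer data, the choice of two clusters, and the starting centroids are assumptions for illustration; a real project would more likely use a library implementation.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, initial_centroids, iterations=10):
    """Plain k-means: alternate between assigning points and updating centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Hypothetical customers: [orders_per_year, avg_order_usd]
points = [[2, 15], [3, 20], [1, 10], [20, 90], [22, 110], [25, 100]]

labels, centroids = k_means(points, initial_centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Here the discovered clusters separate low-frequency, low-value customers from high-frequency, high-value ones; naming and acting on the segments is still a business decision.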

### Chapter 3
#### Business requirements

1. What is the business situation (expansion, etc)
2. What is the business opportunity and how big is it? (identify the right markets)
3. What are the business actions we can take? (prioritize and invest)

Ex: 
- situation: customer churn
- opportunity: reduce churn rate by X%, resulting in Y USD revenue saved
- action: identify and improve churn drivers (advertising, cust service, web site, errors), identify customers at risk and introduce retention campaigns
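
A back-of-the-envelope sketch of sizing the churn opportunity. It assumes X is a relative reduction in the churn rate and that each retained customer keeps paying a fixed annual amount; all numbers and the function name are hypothetical.

```python
def revenue_saved(customers, churn_rate, relative_reduction, annual_revenue_per_customer):
    """Revenue kept by reducing churn, assuming a relative reduction in churn rate."""
    churned_before = customers * churn_rate
    churned_after = customers * churn_rate * (1 - relative_reduction)
    customers_retained = churned_before - churned_after
    return customers_retained * annual_revenue_per_customer

# Hypothetical inputs: 10,000 customers, 20% churn, 10% relative reduction, 120 USD each.
saved = revenue_saved(
    customers=10_000,
    churn_rate=0.20,
    relative_reduction=0.10,
    annual_revenue_per_customer=120.0,
)
print(f"revenue saved: {saved:,.0f} USD")
```

With these inputs, churn drops from 2,000 to 1,800 customers, so 200 customers are retained, worth about 24,000 USD a year before subtracting the cost of the retention actions.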

Always start with inference questions:
- why has churn been increasing?
- which information indicates a potential fraud?

Build on inference with defined prediction questions:
- can we identify customers at risk of churning?
- can we flag potentially risky transactions?

Business opportunity
- size up the opportunity, Cost Benefit
- once you know the drivers of the outcome, how much will it cost to change them and what will be the value of doing that?
- run experiments with model predictions

Actionable Machine Learning
- look at historical levels
- run experiments to see if you can affect the predicted outcome
- if yes, calculate the opportunity
- if no, collect more data, do more research, narrow the question
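
One possible way to turn the checklist above into a decision rule: compare the churn rate in an experiment's treated group against a control group at the historical level, using a two-proportion z-test. The counts, the 20% historical level, and the 1.96 cutoff are assumptions for illustration.

```python
from math import sqrt

def two_proportion_z(churned_control, n_control, churned_treated, n_treated):
    """z statistic for the difference between two churn rates (pooled standard error)."""
    rate_control = churned_control / n_control
    rate_treated = churned_treated / n_treated
    pooled = (churned_control + churned_treated) / (n_control + n_treated)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treated))
    return (rate_control - rate_treated) / standard_error

# Hypothetical counts: control churns at an assumed 20% historical level, treated at 16%.
z = two_proportion_z(churned_control=200, n_control=1000,
                     churned_treated=160, n_treated=1000)

# 1.96 is the conventional two-sided 5% cutoff; using it here is an assumption.
if z > 1.96:
    decision = "calculate the opportunity"
else:
    decision = "collect more data, do more research, narrow the question"
print(f"z = {z:.2f}: {decision}")
```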

#### Model Training