# Operator Probability

### Author: Spencer Smith

### Summary:

Can we reliably predict the probability that an operator's bid will be accepted to a given op?

### Hypothesis:

"New and Ready" operators struggle to be accepted to their first op (the typical need-experience-to-get-experience problem) and will leave the platform out of frustration after many failed bids. By suggesting ops to operators based on their bid acceptance probability, we can reduce operator frustration and decrease the likelihood that operators will quit prematurely.

### Value Proposition:

Increased revenue

***

## Feature Selection

This is a list of the features we are considering to model our problem and the justification for each feature:

### Target feature:

**is_accepted**: true/false (binary classification problem)

- Note: the acceptance probability will be extracted from the classifier's output for this target
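
One way to read the note above: a binary classifier trained on `is_accepted` can return a score between 0 and 1 instead of a hard true/false label, and that score can serve as the acceptance probability. The sketch below is only an illustration under assumptions. No model has been chosen in this notebook, so a small logistic regression written in NumPy is used as a stand-in, and the rows in `X_toy` and `y_toy` are made up for the example rather than taken from platform data.

```python
import numpy as np


def sigmoid(z):
    """Map raw scores to the (0, 1) interval."""
    return 1.0 / (1.0 + np.exp(-z))


def add_intercept(X):
    """Prepend a column of ones so the model can learn a bias term."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


def fit_logistic(X, y, learning_rate=0.5, n_iter=2000):
    """Fit logistic regression weights with plain gradient descent."""
    Xb = add_intercept(X)
    weights = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        predicted = sigmoid(Xb @ weights)
        gradient = Xb.T @ (predicted - y) / len(y)
        weights -= learning_rate * gradient
    return weights


def predict_acceptance_probability(X, weights):
    """Return the modelled probability that is_accepted is true for each row."""
    return sigmoid(add_intercept(X) @ weights)


# Toy, made-up rows (NOT real platform data). Columns are assumed to be
# [is_in_business_labor_pool, operator_reliability_rating scaled to 0-1].
X_toy = np.array([
    [1, 0.9], [1, 0.8], [1, 0.5], [0, 0.9],
    [0, 0.6], [0, 0.4], [0, 0.7], [1, 0.7],
], dtype=float)
y_toy = np.array([1, 1, 0, 1, 0, 0, 0, 1], dtype=float)

weights = fit_logistic(X_toy, y_toy)
probabilities = predict_acceptance_probability(X_toy, weights)

# The hard true/false label is just the probability cut at a threshold.
is_accepted_pred = probabilities >= 0.5
```

The probabilities, not the thresholded labels, are what would be used to rank ops for an operator.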

### Solid features:

##### op_is_autofill: true/false

- The autofill logic searches for bids (operators) with certain qualities, which means that an op's autofill status will affect the importance of other features belonging to an operator

##### operator_reliability_rating: decimal

- The range is currently 1 to 5; the algorithm might handle the data better if the range is scaled to between 0 and 1
- This is pure speculation on my part, but the reliability rating looks to be more 'reliable' (no pun intended) than the operator's overall rating because of the noise generated by the operator quiz, and it is also more visible
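
The rescaling mentioned above is ordinary min-max scaling. A minimal sketch, assuming the rating is always within the stated 1 to 5 range (the function name `scale_rating` is hypothetical):

```python
def scale_rating(rating, low=1.0, high=5.0):
    """Min-max scale a rating from [low, high] to [0, 1]."""
    if not low <= rating <= high:
        raise ValueError(f"rating {rating} is outside [{low}, {high}]")
    return (rating - low) / (high - low)


lowest = scale_rating(1.0)    # bottom of the range
middle = scale_rating(3.0)    # midpoint of the range
highest = scale_rating(5.0)   # top of the range
```

Raising on out-of-range values is a design choice for the sketch; clipping would be an alternative if bad ratings are expected in the data.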

##### operator_is_in_business_labor_pool: true/false

- Operators that are "favorites" of a business get first priority for openings

##### ops_completed_by_operator: whole number

- Operators that have completed more ops have more experience

##### has_schedule_conflict: true/false

- self-explanatory

### Possible, but questionable features:

##### accepted_bids_to_requested_bids_ratio: decimal

- I am skeptical of this feature, as it depends too much on the exact moment at which it is measured and will change constantly

##### unaccepted_ylp_operators_to_available_openings_ratio: decimal

- In short, if the op has two available openings, but three operators in the business ylp have not yet been accepted, will this affect operators who are not in the labor pool?
- As with the feature above, this feature depends heavily on timing and will be difficult to capture

##### newness_of_business: ?

- I actually think this would be a great feature, as I imagine that businesses who have been using our platform longer will eventually "know" the operators they want without needing to depend on our platform to find them
- The question is how to best measure this feature. Number of ops posted is one option, as it does a fairly good job of capturing business activity. We could also use the number of days, weeks, or months that a business has been using our platform.
- Decision: number of ops posted (whole number)
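
Counting ops posted per business could look like the sketch below. The `ops` table and its `op_id` and `business_id` columns are hypothetical stand-ins, since the underlying schema is not shown in this notebook; only the output name `ops_posted_by_business_count` matches a column in the dataset.

```python
import pandas as pd

# Hypothetical ops table: one row per op, with the business that posted it.
ops = pd.DataFrame({
    "op_id": [101, 102, 103, 104, 105],
    "business_id": ["a", "a", "b", "a", "c"],
})

# Number of ops each business has posted.
ops_per_business = ops.groupby("business_id")["op_id"].count()

# Attach the count to every op row so it can be used as a model feature.
ops["ops_posted_by_business_count"] = ops["business_id"].map(ops_per_business)
```

A count taken this way includes ops posted after the bid in question, so a production version would probably need to count only ops posted before each bid.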

##### operator_overall_rating: decimal

- The operator quiz gives operators a free five-star rating that is deleted after they complete their first op, which adds a good amount of noise to this feature

***

## Import the Data

In [1]:
import numpy as np
import pandas as pd

In [2]:
df = pd.read_csv('operatorprobability.csv')

In [3]:
df.head()

| Unnamed: 0 | op_is_autofill | operator_reliability_rating | operator_overall_rating | operator_ops_completed_count | ops_posted_by_business_count | is_in_business_labor_pool | is_accepted |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | True | 0.84 | 0.888086 | 0.230263 | 0.028457 | False | False |
| 1 | True | 0.84 | 0.888086 | 0.230263 | 0.280809 | True | False |
| 2 | True | 0.84 | 0.888086 | 0.230263 | 0.280809 | True | False |
| 3 | True | 0.84 | 0.888086 | 0.230263 | 0.097131 | False | False |
| 4 | True | 0.84 | 0.888086 | 0.230263 | 0.003998 | False | False |
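
The `Unnamed: 0` column in the output above is what pandas typically produces when a CSV was saved with its row index and then read back without declaring it. Assuming that is what happened here, passing `index_col=0` to `pd.read_csv` keeps the index out of the feature columns. The sketch below uses the first two rows of the output, re-typed as CSV text so it runs without the original file.

```python
import io

import pandas as pd

# First two rows of the output above, written as CSV text with an unnamed
# leading index column (an assumption about how the file was saved).
csv_text = """,op_is_autofill,operator_reliability_rating,operator_overall_rating,operator_ops_completed_count,ops_posted_by_business_count,is_in_business_labor_pool,is_accepted
0,True,0.84,0.888086,0.230263,0.028457,False,False
1,True,0.84,0.888086,0.230263,0.280809,True,False
"""

# index_col=0 tells pandas to treat the first column as the row index
# instead of loading it as a feature named "Unnamed: 0".
df_sample = pd.read_csv(io.StringIO(csv_text), index_col=0)
```

For the real file, the equivalent call would be `pd.read_csv('operatorprobability.csv', index_col=0)`.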
