# Python Data Science & Analysis 
### Project: Credit Risk Assessment 

# Abstract

You are hired as part of a data science team at a fintech startup. The startup has offered loans of £1,000 to 1,000 customers in various groups of interest in order to collect data on their likelihood of repayment.

Your role is to offer an account, or predictive model, of which factors lead to loan default, and thereby to advise the new company on its loan strategy.

The company has collected the following data:

```
 "ID",         Customer ID
 "Income",     Annual Pre-Tax Income on-application
 "Term",       Short or Long Term (6mo or 12mo)
 "Balance",    Current Account Balance on-application
 "Debt",       Outstanding Debt on-application
 "Score",      Credit Score (from referencing agency)
 "Default"     Observed Default (True = Default, False = Settle)
```

# Part 1: Rules

Your first project is to consider the application of a single customer and prototype rules which could predict whether they defaulted or not. 

```python

customer = (690, 14300., 'Short Term', 1190., 87., 63., False)

```

### Q. Define a variable `columns` to hold the column names

Goal: your code should contain a variable which lists the names of the columns in text. 

### Q. Define `customer` as above

Goal: include the customer variable defined above. 

### Q. Print the customer details out (the field name and value)

Goal: your code should `print()` details of the customer's loan. 

The output can be anything, so long as you can see, eg., their income `14300`.

Consider formatting the output further than this, eg., by using `columns` above.

```
SAMPLE OUTPUT:

ID 690
Income 14300.0
Term Short Term
Balance 1190.0
Debt 87.0
Score 63.0
Default False
```
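One possible way to produce output like the sample is sketched below. It assumes `columns` is stored as a tuple in the same order as the fields of `customer`, and pairs each name with its value using the built-in `zip()`. This is only one approach; indexing with `range(len(columns))` would work equally well.

```python
# Column names, in the same order as the fields of `customer`.
columns = ("ID", "Income", "Term", "Balance", "Debt", "Score", "Default")

customer = (690, 14300., 'Short Term', 1190., 87., 63., False)

# zip() pairs each column name with the value at the same position.
for name, value in zip(columns, customer):
    print(name, value)
```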

### Q. Print a prediction and observation

1. Goal: print the observed default 
    * HINT: the last element of `customer`
    * HINT: perhaps include the word "observation" in the output
    
---
2. Goal: compute and print a prediction
    * eg., include an `if` that tests whether the customer score is $< 200$
        * print `True` for the prediction
        * otherwise, `False`
    
    

```
SAMPLE OUTPUT:

observation: False
prediction: True
```
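A minimal sketch of this step, assuming the score sits at index `5` of `customer` (its position in the column list) and using the example threshold of `200`:

```python
customer = (690, 14300., 'Short Term', 1190., 87., 63., False)

observation = customer[-1]  # "Default" is the last element
score = customer[5]         # "Score" is the sixth element

# Prototype rule: a low credit score predicts a default.
if score < 200:
    prediction = True
else:
    prediction = False

print("observation:", observation)
print("prediction:", prediction)
```

For this customer the rule predicts a default (`True`) while the observed outcome is `False`, so the prototype rule is wrong here.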

### Q. Improve the prototype rule: consider income

* Goal: Include a condition on the customer income
    * HINT: eg., that the income is `< 25_000`
    * HINT: ie., modify your `if` condition above

```
SAMPLE OUTPUT:

observation: False
prediction: True
```
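The rule might be extended as in the sketch below. The exercise does not say how the two conditions should be combined, so the use of `and` here is an assumption; for this particular customer `or` would give the same prediction.

```python
customer = (690, 14300., 'Short Term', 1190., 87., 63., False)

income = customer[1]
score = customer[5]

# Assumption: both conditions must hold (`and`); the exercise leaves this open.
if score < 200 and income < 25_000:
    prediction = True
else:
    prediction = False

print("observation:", customer[-1])
print("prediction:", prediction)
```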

### Q. Improve the prototype rule: consider term

* Goal: Include a condition on the customer term
    * HINT: eg., that `"Long"` is `in` the term
    * HINT: eg., use `or` to combine conditions

```
SAMPLE OUTPUT:

observation: False
prediction: True
```
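A sketch of the combined rule follows. How the term condition is grouped with the earlier ones is a design choice; here it is assumed that a long-term loan alone is enough to predict a default, so it is joined with `or`. The names `low_score_and_income` and `long_term` are illustrative only.

```python
customer = (690, 14300., 'Short Term', 1190., 87., 63., False)

income, term, score = customer[1], customer[2], customer[5]

# Earlier conditions, assumed to be combined with `and`.
low_score_and_income = score < 200 and income < 25_000

# `in` tests for a substring; this is False for 'Short Term'.
long_term = "Long" in term

prediction = low_score_and_income or long_term

print("observation:", customer[-1])
print("prediction:", prediction)
```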

# Part 2: Datasets

The company now provides a partial dataset. 

Your task is to apply your rules above and estimate your prediction error using them. 

### Q. Print out customer details

Goal: your code should `print()` all loan details. 

* Start by defining a loop over `loans` 
* and printing eg., the customer id and income

ie., consider your `print()` code above defined for a single `customer` and use it with a loop. 

```
SAMPLE OUTPUT:

ID: 215  	 Income: 37900.0
ID: 442  	 Income: 78700.0
ID: 22  	 Income: 41900.0
ID: 711  	 Income: 24600.0
ID: 113  	 Income: 33900.0
ID: 91  	 Income: 23200.0
ID: 268  	 Income: 17700.0
ID: 735  	 Income: 37100.0
ID: 971  	 Income: 35300.0
ID: 858  	 Income: 16700.0
```
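The loop could look like the sketch below. The `loans` list here is a small hypothetical stand-in for the company's dataset: the IDs, incomes and scores are taken from the sample outputs in this part, while the term, balance, debt and default values are invented placeholders.

```python
# Hypothetical stand-in for the provided dataset (same column order as `customer`).
# Term, balance, debt and default values are placeholders, not real data.
loans = [
    (215, 37900., 'Long Term', 1500., 200., 595, False),
    (971, 35300., 'Short Term', 900., 150., None, False),
    (268, 17700., 'Short Term', 300., 800., 289, True),
]

# Each `loan` is one customer tuple; index 0 is the ID, index 1 the income.
for loan in loans:
    print("ID:", loan[0], "\t", "Income:", loan[1])
```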

### Q. Print the score

Goal: print each customer's score if it exists. 


* Modify your loop above to print the customer score
    * HINT: note that one of the scores is `None`
    * HINT: include an `if`
        * to skip to the next iteration when `score is None`, use `continue`


```
SAMPLE OUTPUT:

Income: 37900	ID: 215  	 Score: 595
Income: 78700	ID: 442  	 Score: 1000
Income: 41900	ID: 22  	 Score: 372
Income: 24600	ID: 711  	 Score: 385
Income: 33900	ID: 113  	 Score: 456
Income: 23200	ID: 91  	 Score: 264
Income: 17700	ID: 268  	 Score: 289
Income: 37100	ID: 735  	 Score: 661
Income: 16700	ID: 858  	 Score: 201
```
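A sketch of skipping missing scores is shown below. As before, `loans` is a hypothetical stand-in whose term, balance, debt and default values are invented placeholders. Unpacking each tuple into named variables is optional, but it avoids repeating index numbers.

```python
# Hypothetical stand-in for the provided dataset; placeholder values only.
loans = [
    (215, 37900., 'Long Term', 1500., 200., 595, False),
    (971, 35300., 'Short Term', 900., 150., None, False),
    (268, 17700., 'Short Term', 300., 800., 289, True),
]

for loan in loans:
    # Unpack the tuple into one variable per column.
    cid, income, term, balance, debt, score, default = loan

    # `continue` jumps straight to the next loan, so nothing is printed here.
    if score is None:
        continue

    print("Income:", income, "\t", "ID:", cid, "\t", "Score:", score)
```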

### Q. Include predictions

* Goal: Compute a prediction using a rule above
    * eg., `prediction = (score < 200)` 

---

* Goal: print all predictions.
    * HINT: modify your loop to print `prediction`
---
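* Goal: `print()` whether the prediction matches the observation
    * HINT: compute  `prediction == default`
    * NOTE: the sample output below prints this comparison under the label `Error`, so `True` there means the prediction matched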

```
SAMPLE OUTPUT:

Income: 37900		Score: 595	Error: True
Income: 78700		Score: 1000	Error: True
Income: 41900		Score: 372	Error: True
Income: 24600		Score: 385	Error: True
Income: 33900		Score: 456	Error: True
Income: 23200		Score: 264	Error: True
Income: 17700		Score: 289	Error: False
Income: 37100		Score: 661	Error: True
Income: 16700		Score: 201	Error: False
```
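The prediction and comparison could be added to the loop as sketched here, using the example rule `score < 200`. The `loans` list is again a hypothetical stand-in with placeholder values, and the variable name `match` is illustrative.

```python
# Hypothetical stand-in for the provided dataset; placeholder values only.
loans = [
    (215, 37900., 'Long Term', 1500., 200., 595, False),
    (971, 35300., 'Short Term', 900., 150., None, False),
    (268, 17700., 'Short Term', 300., 800., 289, True),
]

for loan in loans:
    cid, income, term, balance, debt, score, default = loan

    if score is None:
        continue  # no score, so no prediction can be made

    prediction = (score < 200)
    match = (prediction == default)  # True when the rule agrees with the data

    print("Income:", income, "\t", "Score:", score, "\t",
          "Prediction:", prediction, "\t", "Match:", match)
```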

### Q. Include an error

Goal: Start a running error total from `0` before the loop.

* define and initialize an `error` variable 
---

Goal: Modify the loop so that you increase error by one.
* HINT: *when* `prediction != default`
* HINT: `+= 1`

```
SAMPLE OUTPUT:

Income: 37900		Score: 595	Error: True
Income: 78700		Score: 1000	Error: True
Income: 41900		Score: 372	Error: True
Income: 24600		Score: 385	Error: True
Income: 33900		Score: 456	Error: True
Income: 23200		Score: 264	Error: True
Income: 17700		Score: 289	Error: False
Income: 37100		Score: 661	Error: True
Income: 16700		Score: 201	Error: False

Total Error: 2
```
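A running total might be kept as in the sketch below. The counter starts at `0` before the loop and is incremented only when the prediction disagrees with the observed default. The `loans` list is a hypothetical stand-in with placeholder values.

```python
# Hypothetical stand-in for the provided dataset; placeholder values only.
loans = [
    (215, 37900., 'Long Term', 1500., 200., 595, False),
    (971, 35300., 'Short Term', 900., 150., None, False),
    (268, 17700., 'Short Term', 300., 800., 289, True),
]

error = 0  # running total, initialized before the loop

for loan in loans:
    cid, income, term, balance, debt, score, default = loan

    if score is None:
        continue

    prediction = (score < 200)

    # Count one error for every wrong prediction.
    if prediction != default:
        error += 1

print("Total Error:", error)
```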

### Q. Report an accuracy score

Goal: `print()` an accuracy out of $100\%$

Compute the accuracy as $1 - \frac{error}{N_{loans}}$.
* HINT: `n = len(loans)`

Compute a score out of $100\%$. 
* HINT: multiply by $100$ to report a percentage

```
SAMPLE OUTPUT:

Score: 80 %
```
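The accuracy calculation can be sketched as follows. Following the hint, `n` counts every loan in `loans`, including any that were skipped for a missing score; this matches the sample figures above, where 2 errors give 80%. The `loans` list is a hypothetical stand-in with placeholder values, so the figure printed here will differ from the sample.

```python
# Hypothetical stand-in for the provided dataset; placeholder values only.
loans = [
    (215, 37900., 'Long Term', 1500., 200., 595, False),
    (971, 35300., 'Short Term', 900., 150., None, False),
    (268, 17700., 'Short Term', 300., 800., 289, True),
]

error = 0
for loan in loans:
    score, default = loan[5], loan[6]
    if score is None:
        continue
    if (score < 200) != default:
        error += 1

n = len(loans)            # includes the loan with a missing score
accuracy = 1 - error / n  # fraction of loans not counted as errors
percent = accuracy * 100

# `:.0f` rounds to a whole number for display.
print(f"Score: {percent:.0f} %")
```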