# Creating Bank Data
> These are the steps taken to create data that mimics a bank. The final function will live in the Data.py module. This notebook is not required to complete the challenge. However, if you change the function, I suggest testing the change in this notebook (or another one) before altering the Data.py module.

Our data will contain:
- a unique transaction id
- randomly chosen:
    - account total in dollars (balance)
    - interest rate
- calculated interest
    - (balance * interest rate)
- new balance after interest
    - (balance + calculated interest)
- 100,000 observations
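
As a small worked sketch of the two formulas above, here they are applied to a couple of made-up accounts. The sample balances and rates are hypothetical values chosen so the arithmetic is easy to follow; they are not taken from the generated data.

```python
import numpy as np

# hypothetical sample values, not part of the generated data
balance = np.array([1_000.0, 2_500.0])
interest_rate = np.array([0.05, 0.02])

# calculated interest = balance * interest rate
calculated_interest = balance * interest_rate

# new balance = balance + calculated interest
new_balance = balance + calculated_interest

print(calculated_interest)
print(new_balance)
```

Because NumPy works element by element, the same two lines scale from two accounts to 100,000 without a loop.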

### Feel free to tweak the data with your own specifications on interest, observations, etc.

In [1]:
# to create random values
import numpy as np

# to create a dataframe, calculations, etc.
import pandas as pd

In [43]:
# set the number of observations for random values as 100,000 transactions
observations = 100_000

# Account Balances
- creating random values for the total balance in the bank account for each transaction
- np.random.normal draws random samples from a normal distribution
    - np.random.normal(mean, stddev, size)
- the average American has $8,863 in their bank account (per the 2019 article below); the code below uses a mean of 8,000 and a standard deviation of 1,500
    - https://www.cnbc.com/2019/03/11/how-much-money-americans-have-in-their-savings-accounts-at-every-age.html
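
If you want the same balances on every run, one option is NumPy's seeded `Generator`. This is only a sketch under that assumption: the notebook itself calls `np.random.normal` unseeded, and the seed value and the name `rng` here are illustrative. It also counts negative draws, since a normal distribution has no lower bound and other parameter choices could produce negative balances.

```python
import numpy as np

# seeded generator so the draw is repeatable (the seed value is arbitrary)
rng = np.random.default_rng(42)

observations = 100_000

# loc is the mean, scale is the standard deviation
totals = rng.normal(loc=8_000, scale=1_500, size=observations)

# quick summary of what was drawn
print(totals.shape)
print(round(totals.mean()), round(totals.std()))

# number of negative balances in this draw
print((totals < 0).sum())
```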

In [106]:
totals = np.random.normal(8_000, 1_500, observations)
totals.min()

1463.25805072399

# Interest Rates
- creating random decimal values for the interest rate of each transaction
- np.random.random draws random samples from the interval [0.0, 1.0)
    - range includes 0, does not include 1
    - np.random.random(size)
- multiplying by .1 scales the values to [0.0, 0.1), i.e. rates from 0% up to (but not including) 10%
- for comparison, banks have an average interest rate of 0.06% on checking accounts
    - https://www.valuepenguin.com/banking/average-bank-interest-rates#:~:text=The%20average%20bank%20interest%20rate,market%20interest%20rate%20is%200.16%25.
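
A short sketch to confirm the interval and the effect of the `.1` multiplier. It uses a seeded `Generator` for repeatability, which is an assumption on my part; the notebook uses the unseeded `np.random.random`.

```python
import numpy as np

# seeded generator so the draw is repeatable (the seed value is arbitrary)
rng = np.random.default_rng(0)

observations = 100_000

# floats in [0.0, 1.0): 0 can appear, 1 cannot
unit = rng.random(observations)

# scaled to [0.0, 0.1)
interest = unit * 0.1

print(interest.min(), interest.max())
print(interest.mean())
```

The mean lands near 0.05, so these simulated rates average about 5%. To stay closer to the cited 0.06% average you could pick a much smaller multiplier.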

In [133]:
interest = np.random.random(observations) * .1
interest

array([0.01582813, 0.07602612, 0.03427196, ..., 0.09894789, 0.08258765,
       0.02603719])

# Creating the df
- the transaction id will be the df index
- we created random balances and interest rates; these will be added to the df
- next, we'll calculate the interest and new total
- finally, we will put it all together into a repeatable function
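
One detail worth knowing before adding columns: when you assign a pandas object to a column, pandas matches rows by index label, not by position. The toy sketch below (hypothetical names `values` and `small`) shows the difference between a label-aligned assignment and a positional one.

```python
import numpy as np
import pandas as pd

# five values sitting at positions 0..4
values = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# an empty df whose index labels run 1..3
small = pd.DataFrame(index=range(1, 4))

# a Series built from the array carries labels 0..4
# assignment matches labels, so labels 1, 2, 3 are kept and 0 and 4 are dropped
small['aligned'] = pd.Series(values)

# a plain array is assigned by position and must match the df length
small['positional'] = values[:len(small)]

print(small)
```

This is why a df with an index of 1 through 1,000 ends up holding only 1,000 rows even when the random arrays are much longer.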

In [126]:
# creating an empty df with an index starting at 1 and ending at 1,000
# (so only 1,000 of the 100,000 generated values will be kept)
df = pd.DataFrame(index=range(1,1001))

In [127]:
# creating a column in the df for the account balance using the random array created earlier
# assigning a pandas object aligns on the index, so only labels 1 through 1,000 are kept
df['balances'] = pd.DataFrame(totals)

In [128]:
# creating a column in the df for the interest rate using the random array created earlier
df['interest_rate'] = pd.DataFrame(interest)

In [129]:
# taking a look at the df so far
df

Unnamed: 0,balances,interest_rate
1,6824.016840,0.022580
2,7422.930463,0.056376
3,9414.588795,0.043811
4,7175.198852,0.001633
5,6741.519427,0.043785
...,...,...
996,6754.250045,0.069544
997,7426.375854,0.010807
998,5993.262779,0.039739
999,9011.671929,0.037660

In [130]:
# creating a new column by calculating the interest with our existing columns
# the amount of interest earned is the account balance times the interest rate
df['calculated_interest'] = df['balances'] * df['interest_rate']

In [131]:
# creating a new column by calculating the total with our existing columns
# the new total is the account balance plus the interest earned
df['new_balance'] = df['balances'] + df['calculated_interest']

In [132]:
# taking a look at the complete df
df

Unnamed: 0,balances,interest_rate,calculated_interest,new_balance
1,6824.016840,0.022580,154.083813,6978.100654
2,7422.930463,0.056376,418.476177,7841.406640
3,9414.588795,0.043811,412.461519,9827.050314
4,7175.198852,0.001633,11.719498,7186.918350
5,6741.519427,0.043785,295.178718,7036.698145
...,...,...,...,...
996,6754.250045,0.069544,469.715928,7223.965973
997,7426.375854,0.010807,80.259155,7506.635009
998,5993.262779,0.039739,238.166970,6231.429749
999,9011.671929,0.037660,339.380441,9351.052370


# Now, Put it All Together in a Function

In [134]:
def create_data():
    '''
    Creates a df of fake bank transactions.
    Includes the account balance, interest rate, earned interest, and balance after interest.
    Generates different random values each time the function is run.
    '''
    # set the number of random values to generate as 100,000
    observations = 100_000

    # random floats generated with average of 8,000 and stddev of 1,500
    totals = np.random.normal(8_000, 1_500, observations)

    # random floats generated from [0, 1.0)
    # multiply by .1 to scale the rates into [0, 0.1)
    interest = np.random.random(observations) * .1

    # creating an empty df with an index starting at 1 and ending at 1,000
    # (only 1,000 of the generated values are kept; widen the range to keep more)
    df = pd.DataFrame(index=range(1,1001))

    # creating a column in the df for the account balance using the random array created above
    df['balances'] = pd.DataFrame(totals)

    # creating a column in the df for the interest rate using the random array created above
    df['interest_rate'] = pd.DataFrame(interest)

    # creating a new column by calculating the interest with our existing columns
    # the amount of interest earned is the account balance times the interest rate
    df['calculated_interest'] = df['balances'] * df['interest_rate']

    # creating a new column by calculating the total with our existing columns
    # the new total is the account balance plus the interest earned
    df['new_balance'] = df['balances'] + df['calculated_interest']

    return df
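
If you take up the invitation to tweak the data, one possible direction is a parameterized variant. The sketch below is not the Data.py function: the name `create_data_sketch`, its parameters, the `transaction_id` index name, and the seeded `Generator` are all my own assumptions for illustration. Unlike the fixed 1,000-row index above, it builds exactly `observations` rows.

```python
import numpy as np
import pandas as pd

def create_data_sketch(observations=1_000, mean=8_000, stddev=1_500,
                       max_rate=0.1, seed=None):
    '''
    Hypothetical, parameterized take on create_data.
    Pass a seed to get the same values on every call.
    '''
    rng = np.random.default_rng(seed)

    # balances from a normal distribution, rates from [0, max_rate)
    balances = rng.normal(mean, stddev, observations)
    interest_rate = rng.random(observations) * max_rate

    # build the df in one step; the index 1..observations is the transaction id
    df = pd.DataFrame(
        {'balances': balances, 'interest_rate': interest_rate},
        index=pd.RangeIndex(1, observations + 1, name='transaction_id'),
    )

    # interest earned, then the balance after interest
    df['calculated_interest'] = df['balances'] * df['interest_rate']
    df['new_balance'] = df['balances'] + df['calculated_interest']
    return df

sample = create_data_sketch(observations=5, seed=1)
print(sample)
```

Passing the arrays directly to the `pd.DataFrame` constructor assigns them by position, so no generated values are dropped by index alignment.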

# Testing the function created

In [135]:
df_function = create_data()
df_function

Unnamed: 0,balances,interest_rate,calculated_interest,new_balance
1,8918.544688,0.016401,146.273274,9064.817961
2,7692.612869,0.092795,713.835824,8406.448693
3,7024.140237,0.070404,494.525912,7518.666149
4,8442.240542,0.015295,129.126693,8571.367235
5,8989.550043,0.018346,164.921703,9154.471747
...,...,...,...,...
996,7722.487787,0.006990,53.983635,7776.471422
997,6700.474357,0.050216,336.468273,7036.942630
998,7673.842980,0.088342,677.921952,8351.764932
999,6476.623124,0.041125,266.348838,6742.971963


# ^Looking Good!
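
Beyond eyeballing the output, a few automated checks can catch mistakes after you tweak the function. This is a minimal sketch: `check_bank_df` is a hypothetical helper, not part of Data.py, and it assumes the column names shown in the output above and rates in [0, 0.1). It is demonstrated on a tiny hand-built df so it runs on its own.

```python
import numpy as np
import pandas as pd

def check_bank_df(df):
    '''Hypothetical helper: raises AssertionError if a basic property fails.'''
    expected = ['balances', 'interest_rate', 'calculated_interest', 'new_balance']
    assert list(df.columns) == expected, 'unexpected columns'
    assert df.index.is_unique, 'transaction ids must be unique'
    assert df.notna().all().all(), 'missing values found'

    # rates should fall in [0, 0.1)
    rates = df['interest_rate']
    assert ((rates >= 0) & (rates < 0.1)).all(), 'rate out of range'

    # the two calculated columns should follow their formulas
    assert np.allclose(df['calculated_interest'], df['balances'] * rates)
    assert np.allclose(df['new_balance'], df['balances'] + df['calculated_interest'])
    return True

# tiny hand-built df with made-up values
tiny = pd.DataFrame(
    {'balances': [1_000.0, 2_000.0], 'interest_rate': [0.05, 0.01]},
    index=[1, 2],
)
tiny['calculated_interest'] = tiny['balances'] * tiny['interest_rate']
tiny['new_balance'] = tiny['balances'] + tiny['calculated_interest']

print(check_bank_df(tiny))
```

In the notebook you would pass the function's result instead, for example `check_bank_df(df_function)`.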