# Credit Scoring with Snowpark for Python: Set-up Notebook
Author: Zohar Nissare-Houssen

## 1. Snowflake Trial Account

The prerequisite is a Snowflake account. If you do not have one, you can sign up for a free 30-day [Snowflake trial](https://signup.snowflake.com/).

After signing up for the trial, bookmark the URL of your Snowflake account and save your credentials, as they will be needed in this lab.


This version of the notebook requires Snowpark **0.4.0** or higher.
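One way to confirm the requirement above is to read the installed package version. The sketch below assumes Snowpark was installed from PyPI under the name `snowflake-snowpark-python`; `parse_version` and `snowpark_is_recent_enough` are hypothetical helpers written for this illustration, not part of Snowpark.

```python
from importlib import metadata

MINIMUM_SNOWPARK = (0, 4, 0)  # minimum version stated in this notebook


def parse_version(text):
    """Turn '0.4.0' into (0, 4, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad short versions such as '0.4' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def snowpark_is_recent_enough(installed, minimum=MINIMUM_SNOWPARK):
    """Return True when the installed version string meets the minimum."""
    return parse_version(installed) >= minimum


try:
    # Assumption: the distribution name on PyPI is snowflake-snowpark-python.
    installed = metadata.version("snowflake-snowpark-python")
    status = "OK" if snowpark_is_recent_enough(installed) else "too old"
    print(f"Snowpark {installed}: {status}")
except metadata.PackageNotFoundError:
    print("snowflake-snowpark-python is not installed")
```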

## 2. Python Libraries

The following libraries are needed to run this demo. Use the cells in this section to install any Python library that is missing from your environment.

In [None]:
pip install scikit-plot

In [None]:
pip install pyarrow==6.0.0

In [None]:
pip install seaborn

In [None]:
pip install matplotlib
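To see which of the libraries above still need installing, a check along these lines may help. It is a minimal sketch: `REQUIRED` and `missing_packages` are names made up for this example, and the mapping assumes that `scikit-plot` is imported as `scikitplot`.

```python
import importlib.util

# pip name -> importable module name (scikit-plot is imported as "scikitplot")
REQUIRED = {
    "scikit-plot": "scikitplot",
    "pyarrow": "pyarrow",
    "seaborn": "seaborn",
    "matplotlib": "matplotlib",
}


def missing_packages(required):
    """Return the pip names whose module cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages(REQUIRED)
print("Missing:", ", ".join(missing) if missing else "none")
```

Only the packages reported as missing need a `pip install` cell to be run.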

## 3. File Download

### 3.1 The Dataset

In [None]:
! curl -O https://raw.githubusercontent.com/zoharsan/snowpark_credit_score/main/credit_files.csv

In [None]:
! curl -O https://raw.githubusercontent.com/zoharsan/snowpark_credit_score/main/credit_request.csv

### 3.2 The creds.json credential file

Edit the file below with the credentials of your Snowflake account and save it. The main notebook uses it to connect to Snowflake:


```
{
  "account": "<account-name>",
  "user": "<user>",
  "password": "<password>",
  "warehouse": "<warehouse-name>",
  "database": "CREDIT_BANK",
  "schema": "PUBLIC"
}
```

In [1]:
! curl -L -O https://github.com/zoharsan/snowpark_credit_score/raw/main/creds.json

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   146  100   146    0     0    618      0 --:--:-- --:--:-- --:--:--   616
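Before moving on, it can be worth checking that `creds.json` was downloaded and edited correctly. The following is a minimal sketch, not part of the original lab: `load_creds` and `REQUIRED_KEYS` are hypothetical names, and the key list is taken from the template shown above.

```python
import json

# Keys taken from the creds.json template in this section.
REQUIRED_KEYS = ("account", "user", "password", "warehouse", "database", "schema")


def load_creds(path="creds.json"):
    """Read the credential file and reject missing keys or unedited placeholders."""
    with open(path) as handle:
        creds = json.load(handle)  # raises json.JSONDecodeError if not valid JSON

    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise ValueError(f"creds.json is missing keys: {missing}")

    # Template values look like "<account-name>"; flag any that were left as-is.
    unedited = [
        key
        for key, value in creds.items()
        if str(value).startswith("<") and str(value).endswith(">")
    ]
    if unedited:
        raise ValueError(f"creds.json still has placeholder values for: {unedited}")
    return creds
```

Calling `load_creds()` on a file that is not valid JSON (for example, an HTML page saved by a failed download) raises `json.JSONDecodeError`, which is a quick signal that the download should be repeated.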


## 4. The Database

In the cell below, fill in the parameters needed to connect to your Snowflake environment.

In [None]:
import pandas as pd
from snowflake.snowpark import Session

# Replace the placeholders with your own account details.
connection_parameters = {
    "user": "<user_name>",
    "password": "<password>",
    "account": "<account_name>",
    "warehouse": "<warehouse_name>",
}

session = Session.builder.configs(connection_parameters).create()

# Note: "create or replace" drops any existing CREDIT_BANK database first.
session.sql("create or replace database credit_bank").collect()
session.sql("use schema credit_bank.public").collect()

# Print the session context to confirm the connection.
print(session.sql("select current_warehouse(), current_database(), current_schema(), current_user(), current_role()").collect())

## 5. The Tables

Two tables are associated with this demo:

* CREDIT_FILES: This table contains the credits currently on file, along with their credit standing: whether the loan is being repaid or whether there are issues with reimbursing the credit. This dataset is used for historical analysis and to build a machine learning model that scores new applications.

* CREDIT_REQUESTS: This table contains the new credit requests that the bank needs to approve or reject based on the ML algorithm.


### 5.1 CREDIT_FILES Table



After running the command below, log into your Snowflake environment and make sure the table was created. It should have 2.9K rows. Do not run this cell twice: the rows would be appended a second time, and the duplicates would make the ML model appear to overfit. If you need to rerun it, drop the table first, either from the Snowflake console or here using the same syntax as above, e.g. `session.sql("drop table CREDIT_FILES").collect()`.

In [None]:
credit_files = pd.read_csv('credit_files.csv')
session.write_pandas(credit_files, "CREDIT_FILES", auto_create_table=True)
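If the load has to be repeated, the drop-then-write sequence described above could be wrapped in one helper so that the table never receives the rows twice. This is a sketch only: `reload_table` is a hypothetical helper, and it assumes `session` is the Snowpark session created in section 4.

```python
def reload_table(session, frame, table_name):
    """Drop the table if it exists, then load the DataFrame into a fresh copy."""
    # The table names used in this notebook are fixed constants, so building
    # the statement with an f-string is acceptable here; do not pass user input.
    session.sql(f"drop table if exists {table_name}").collect()
    return session.write_pandas(frame, table_name, auto_create_table=True)
```

For example, `reload_table(session, credit_files, "CREDIT_FILES")` can be run any number of times and leaves a single copy of the data.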

In [None]:
credit_df = session.table("CREDIT_FILES")
credit_df.schema

In [None]:
credit_df.toPandas().head()

In [None]:
credit_df.toPandas().info()

### 5.2 CREDIT_REQUESTS Table

After running the command below, log into your Snowflake environment and make sure the table was created. It should have 60 rows. Do not run this cell twice: the rows would be appended a second time. If you need to rerun it, drop the table first, either from the Snowflake console or here using the same syntax as above, e.g. `session.sql("drop table CREDIT_REQUESTS").collect()`.

In [None]:
credit_requests = pd.read_csv('credit_request.csv')
session.write_pandas(credit_requests, "CREDIT_REQUESTS", auto_create_table=True)

In [None]:
credit_req_df = session.table("CREDIT_REQUESTS")
credit_req_df.schema

In [None]:
credit_req_df.toPandas().head()

In [None]:
credit_req_df.toPandas().info()
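The row counts mentioned in sections 5.1 and 5.2 can also be checked from the notebook instead of the Snowflake console. The sketch below compares each table with the CSV it was loaded from, which would reveal an accidental double load; `row_count_matches` is a hypothetical helper, and `session`, `credit_files` and `credit_requests` are assumed to be the objects created in the earlier cells.

```python
def row_count_matches(session, table_name, frame):
    """Compare the Snowflake table's row count with the local DataFrame's length."""
    in_table = session.table(table_name).count()
    in_file = len(frame)
    if in_table != in_file:
        print(f"{table_name}: {in_table} rows in Snowflake, {in_file} in the CSV")
    return in_table == in_file
```

A call such as `row_count_matches(session, "CREDIT_FILES", credit_files)` returns `False` if the table holds more rows than the file, in which case the table should be dropped and loaded again. The same check applies to `CREDIT_REQUESTS` with `credit_requests`.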