Public authorities are required by Section 2800 of Public Authorities Law to submit annual reports to the Authorities Budget Office that include loans data. Local development corporations are required to report information on the projects they support and how those approved projects are financed (either through grants, loans, or bonds). The dataset consists of loans data reported by Local Development Corporations beginning with fiscal years in 2011.

Let's explore what we can infer from the data.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
#importing the required libraries
import warnings
warnings.filterwarnings("ignore")

import seaborn as sns
import matplotlib.pyplot as plt

import datetime as dt


pd.set_option('display.max_columns',500)
pd.set_option('display.max_rows',500)

In [None]:
NYC = pd.read_csv("/kaggle/input/new-york-city-corporate-loans/NYC_Loans.csv")
NYC.head()

In [None]:
NYC.shape

In [None]:
NYC.info()

In [None]:
#checking the null values of the dataset.
NYC.isnull().sum(axis=0)/len(NYC)*100

In [None]:
NYC.Loans.value_counts(normalize=True)

##### Here, the Loans column appears to contain only "No" values, which is misleading. On closer inspection, the "NaN" entries represent "Yes" values, so we replace them with "Yes".

In [None]:
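# Assign the result back instead of using inplace=True on a column accessor,
# which newer pandas versions may not propagate to the parent DataFrame.
NYC["Loans"] = NYC["Loans"].fillna("Yes")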
NYC.head()

In [None]:
NYC.Loans.value_counts(normalize=True)
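
The replacement above rests on the assumption that a missing `Loans` value means a loan was granted. Here is a minimal sketch of one way to sanity-check that assumption: look at the rows where the flag is missing and see how many of them carry a loan amount. The toy frame is hypothetical and only borrows the column names `Loans` and `Original Loan Amount` from the dataset; the same check would need to run on the raw data, before `fillna`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names mirror the dataset.
toy = pd.DataFrame({
    "Loans": ["No", np.nan, np.nan, "No"],
    "Original Loan Amount": [np.nan, 50000.0, 120000.0, np.nan],
})

# Rows where the Loans flag is missing.
missing_flag = toy["Loans"].isna()

# Share of those rows that carry an actual loan amount.
share_with_amount = toy.loc[missing_flag, "Original Loan Amount"].notna().mean()
print(share_with_amount)
```

A share close to 1 would support reading "NaN" as "Yes"; a noticeably lower share would suggest that some of the missing flags are simply missing.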

In [None]:
NYC["Loan Fund Sources"].value_counts(normalize=True).plot.pie()

In [None]:
NYC["Loan Terms Completed"].value_counts(normalize=True)


In [None]:

NYC["Loan Purpose"].value_counts(normalize=True).plot.barh()


In [None]:
NYC["New Jobs"].value_counts(normalize=True)

In [None]:
NYC["Recipient State"].value_counts(normalize=True).head(5)

In [None]:
NYC["Recipient Name"].nunique()

The number of unique recipient names is lower than the number of rows, so some recipients appear more than once. Let's see how often each name occurs.

In [None]:
Recipient_Dups = NYC.pivot_table(index = ['Recipient Name'], aggfunc ='size')
Recipient_Dups
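
The size table above lists every recipient, including those that appear only once. As a sketch, the repeated names can be isolated by counting rows per name and keeping counts greater than one. The recipient names below are made up for illustration.

```python
import pandas as pd

# Hypothetical recipient names, for illustration only.
toy = pd.DataFrame({
    "Recipient Name": ["Acme LLC", "Bolt Inc", "Acme LLC",
                       "Cedar Co", "Acme LLC", "Bolt Inc"],
})

# Number of rows per recipient, most frequent first.
counts = toy["Recipient Name"].value_counts()

# Keep only the recipients that appear more than once.
repeated = counts[counts > 1]
print(repeated)
```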

##### For the rest of the EDA, we keep only the rows where a loan was granted.

In [None]:
NYC = NYC[NYC["Loans"] == "Yes"]
NYC.head()

In [None]:
#Displaying the null values of the new NYC data.
NYC.isnull().sum(axis=0)/len(NYC)*100

##### Displaying each fund source's contribution in terms of total original loan amount.

In [None]:
# Total original loan amount per fund source, kept as a Series so it can be reused.
NYC_Fund = NYC.groupby("Loan Fund Sources")["Original Loan Amount"].sum()
NYC_Fund.plot.pie()
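
A pie chart conveys proportions but not the numbers behind them. As a rough sketch, the same grouping can be converted into percentage shares and read as a table. The fund source labels and amounts below are hypothetical; only the column names come from the dataset.

```python
import pandas as pd

# Hypothetical fund sources and amounts, for illustration only.
toy = pd.DataFrame({
    "Loan Fund Sources": ["Federal", "State", "Federal", "Private"],
    "Original Loan Amount": [100000.0, 50000.0, 300000.0, 50000.0],
})

# Total loan amount per fund source.
totals = toy.groupby("Loan Fund Sources")["Original Loan Amount"].sum()

# Each source's share of the overall total, in percent, largest first.
shares = (totals / totals.sum() * 100).sort_values(ascending=False)
print(shares.round(1))
```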

##### Getting the data of the recipients who repaid the full loan amount

In [None]:
NYC_Fullpaid = NYC[NYC["Amount Repaid"] >= NYC["Original Loan Amount"]]
NYC_Fullpaid.head()

##### List of recipients who have cleared their loan amounts.

In [None]:
Recipient_list = NYC_Fullpaid.groupby(["Recipient Name"],as_index=False)["Original Loan Amount"].sum()
Recipient_list.sort_values(by=["Original Loan Amount"],ascending=False,inplace=True)
Recipient_list.head()

In [None]:
NYC_Fullpaid["Loan Purpose"].value_counts(normalize=True).plot.barh()

##### The figure above indicates that Business Expansion/Startup loans make up the largest share of fully repaid loans, ahead of the other loan purposes.

##### Let's look at the loans for which repayment has not yet started.

In [None]:
NYC_Unpaid = NYC[NYC["Amount Repaid"] == 0]
NYC_Unpaid.head()
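
The filters used here cover two cases: fully repaid and not yet started. Comparisons against "NaN" evaluate to False, so a row with a missing `Amount Repaid` lands in neither subset. A minimal sketch of labelling every row, including partial and unknown cases, using `np.select`. The toy values and the `Repayment Status` column name are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the two amount column names mirror the dataset.
toy = pd.DataFrame({
    "Original Loan Amount": [1000.0, 1000.0, 1000.0, 1000.0],
    "Amount Repaid": [1000.0, 0.0, 400.0, np.nan],
})

# Conditions are checked in order; the first match wins.
conditions = [
    toy["Amount Repaid"].isna(),
    toy["Amount Repaid"] >= toy["Original Loan Amount"],
    toy["Amount Repaid"] == 0,
]
labels = ["unknown", "fully repaid", "not started"]

# Anything that matches none of the conditions is a partial repayment.
toy["Repayment Status"] = np.select(conditions, labels, default="partially repaid")
print(toy["Repayment Status"].tolist())
```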

##### Authorities that granted the highest total loan amounts among loans with no repayment yet

In [None]:
Authority_list = NYC_Unpaid.groupby(["Authority Name"],as_index = False)["Original Loan Amount"].sum()
Authority_list.sort_values(by=["Original Loan Amount"],ascending=False,inplace=True)
Authority_list.head()

##### Recipients with the highest total loan amounts among loans with no repayment yet

In [None]:
Recipient_list = NYC_Unpaid.groupby(["Recipient Name"],as_index=False)["Original Loan Amount"].sum()
Recipient_list.sort_values(by=["Original Loan Amount"],ascending=False,inplace=True)
Recipient_list.head()

##### List of fund sources that allotted funds

In [None]:
Fund_list = NYC_Unpaid.groupby(["Loan Fund Sources"],as_index = False)["Original Loan Amount"].sum()
Fund_list.sort_values(by=["Original Loan Amount"],ascending=False,inplace=True)
Fund_list

In [None]:
NYC_Unpaid["Loan Purpose"].value_counts(normalize=True).plot.barh()

##### The figure above indicates that Commercial Property Construction/Acquisition/Revitalization/Improvement loans account for the largest share of loans with no repayment yet.

Thank You!!!