# Prediction of financial distress

## Description

This dataset (taken from [here](https://www.kaggle.com/shebrahimi/financial-distress)) contains temporal data about several hundred companies. There are ~80 features per company for each time step within a selected window, plus one target variable: the level of "financial distress" the company currently faces.

## Task

Predict whether a company faces financial distress based on the current and past input features.
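One way to turn "current and past input features" into model inputs is to cut each company's history into fixed-length windows. The block below is a minimal sketch, not the approach used later in this notebook: the helper name `make_windows`, the window length of 3, and the toy data are all assumptions made for illustration. It also assumes the time steps of each company are contiguous and that a company is distressed when the target is at or below -0.5.

```python
import numpy as np
import pandas as pd

def make_windows(df, feature_cols, window=3, threshold=-0.5):
    """Build (samples, window, features) inputs and binary labels per company.

    Hypothetical helper: each window ends at row t and is labeled with the
    distress status of that same row.
    """
    X, y = [], []
    ordered = df.sort_values(["Company", "Time"])
    for _, group in ordered.groupby("Company"):
        feats = group[feature_cols].to_numpy(dtype=float)
        labels = (group["Financial Distress"].to_numpy() <= threshold).astype(int)
        # Companies with fewer than `window` rows produce no samples.
        for end in range(window, len(group) + 1):
            X.append(feats[end - window:end])
            y.append(labels[end - 1])
    if not X:
        return np.empty((0, window, len(feature_cols))), np.empty((0,), dtype=int)
    return np.stack(X), np.array(y)

# Toy data (made up) with two stand-in feature columns.
toy = pd.DataFrame({
    "Company": [1, 1, 1, 1, 2, 2],
    "Time": [1, 2, 3, 4, 1, 2],
    "Financial Distress": [0.01, -0.46, -0.33, -0.57, 1.36, 0.90],
    "x1": [1.28, 1.27, 1.05, 1.11, 1.06, 1.06],
    "x2": [0.02, 0.01, -0.06, -0.02, 0.11, 0.15],
})
X, y = make_windows(toy, ["x1", "x2"], window=3)
```

With this toy input, company 1 yields two windows and company 2 (only two rows) yields none. Shorter windows keep more companies at the cost of less history per sample.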

## Dataset columns description

```
Company            - Company id
Time               - Timestamp id for each company; varies between 1 and 14.
Financial Distress - The target variable
x1-x80             - Financial and non-financial characteristics for each company.
```

Notes from authors:
- If the target variable "Financial Distress" is greater than -0.50, the company should be considered healthy. Otherwise, it is regarded as financially distressed.
- Input features belong to the previous time period and should be used to predict whether the company will be financially distressed or not.
- Feature x80 is a categorical variable.
- The data set is imbalanced. There are 136 financially distressed companies against 286 healthy ones, i.e. 136 firm-year observations are financially distressed while 3546 firm-year observations are healthy.
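The thresholding rule from the authors' notes can be written as a small helper that derives a binary label and lets us count the class balance. This is a sketch under assumptions: the column name `distressed`, the helper `add_distress_label`, and the toy values are illustrative and not part of the dataset.

```python
import pandas as pd

DISTRESS_THRESHOLD = -0.5  # threshold quoted in the authors' notes

def add_distress_label(df, target_col="Financial Distress",
                       threshold=DISTRESS_THRESHOLD):
    """Return a copy of df with a binary `distressed` column (1 = distressed)."""
    out = df.copy()
    # Healthy means strictly greater than the threshold; everything else is distressed.
    out["distressed"] = (out[target_col] <= threshold).astype(int)
    return out

# Toy data (made up) standing in for the real frame.
toy = pd.DataFrame({
    "Company": [1, 1, 1, 2, 2],
    "Time": [1, 2, 3, 1, 2],
    "Financial Distress": [0.01, -0.46, -0.57, 1.36, 0.90],
})
labeled = add_distress_label(toy)

# Imbalance at the observation (firm-year) level and at the company level.
obs_counts = labeled["distressed"].value_counts()
distressed_companies = labeled.groupby("Company")["distressed"].max().sum()
```

Applying the same two counts to the full data frame is a quick way to compare against the figures quoted by the authors.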


Other Notes:
- 1 timestep = 1 year (authors refer to firm-year observations)
- All "positive" time series (companies that are financially distressed) end with the target variable below the threshold. No company becomes "healthy" again.
- All positive series also end after one year in financial distress: once the target variable drops below -0.5, the series stops.
- The points above lead to the hypothesis that the dataset is actually about companies going bankrupt or defaulting on loans.
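The observation that distress only ever appears in a company's final time step can be checked mechanically. The sketch below uses a hypothetical helper `distress_only_at_end` and made-up toy data. It reports any company that has a distressed observation before its last recorded year.

```python
import pandas as pd

def distress_only_at_end(df, threshold=-0.5):
    """Check that distressed rows occur only at each company's last time step.

    Returns (holds, offending_company_ids). Hypothetical helper for illustration.
    """
    ordered = df.sort_values(["Company", "Time"])
    distressed = ordered["Financial Distress"] <= threshold
    last_time = ordered.groupby("Company")["Time"].transform("max")
    is_last = ordered["Time"] == last_time
    # A violation is a distressed observation that is not the final one.
    violations = ordered[distressed & ~is_last]
    return violations.empty, violations["Company"].unique().tolist()

# Toy data (made up): company 3 recovers after a distressed year.
toy = pd.DataFrame({
    "Company": [1, 1, 2, 2, 3, 3, 3],
    "Time": [1, 2, 1, 2, 1, 2, 3],
    "Financial Distress": [0.2, -0.6, 1.1, 0.9, 0.3, -0.7, 0.4],
})
holds_all, offenders = distress_only_at_end(toy)
holds_subset, _ = distress_only_at_end(toy[toy["Company"] != 3])
```

If the hypothesis holds for the real data, the first return value is `True` and the list of offending companies is empty.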

## Imports & Drive mount

In [None]:
%tensorflow_version 2.x
%matplotlib inline

import random

import numpy as np
import pandas as pd

import matplotlib
import matplotlib.pyplot as plt

from tqdm.notebook import tqdm

In [None]:
matplotlib.rcParams['figure.figsize'] = (15, 6)
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

In [None]:
from google.colab import drive
drive.mount('/content/drive/')

Mounted at /content/drive/


## Data loading and basic analysis

In [None]:
df = pd.read_csv('/content/drive/My Drive/ml-college/time-series-analysis/data/Financial Distress.csv')
df

Unnamed: 0,Company,Time,Financial Distress,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23,x24,x25,x26,x27,x28,x29,x30,x31,x32,x33,x34,x35,x36,x37,...,x44,x45,x46,x47,x48,x49,x50,x51,x52,x53,x54,x55,x56,x57,x58,x59,x60,x61,x62,x63,x64,x65,x66,x67,x68,x69,x70,x71,x72,x73,x74,x75,x76,x77,x78,x79,x80,x81,x82,x83
0,1,1,0.010636,1.2810,0.022934,0.87454,1.21640,0.060940,0.188270,0.52510,0.018854,0.182790,0.006449,0.85822,2.005800e+00,0.125460,6.97060,4.65120,0.050100,2.1984,0.018265,0.024978,0.027264,1.41730,9.5554,0.148720,0.66995,214.760,12.641,6.46070,0.043835,0.204590,0.35179,8.316100,0.28922,0.76606,2.5825,77.400,0.026722,1.630700,...,0.180160,1.50060,0.026224,7.05130,1174.90,5.33990,0.851280,12.837,0.061737,0.180900,209.87,-0.582550,0.471010,0.109900,0.000000,0.000000,0.22009,7.12410,15.38100,3.27020,17.8720,34.6920,30.087,12.8,7991.4,364.9500,15.8,61.476,4.0,36.0,85.437,27.07,26.102,16.000,16.0,0.2,22,0.060390,30,49
1,1,2,-0.455970,1.2700,0.006454,0.82067,1.00490,-0.014080,0.181040,0.62288,0.006423,0.035991,0.001795,0.85152,-4.864400e-01,0.179330,4.57640,3.75210,-0.014011,2.4575,0.027558,0.028804,0.041102,1.18010,7.2952,0.056026,0.67048,38.242,12.877,5.55060,0.265480,0.150190,0.41763,9.527600,0.41561,0.81699,2.6033,95.947,0.007580,0.837540,...,0.046857,1.00950,0.007864,4.60220,1062.50,3.73890,0.943970,12.881,-0.000565,0.056298,250.14,-0.474770,0.385990,0.369330,0.000000,0.000000,0.00000,7.41660,7.10500,14.32100,18.7700,124.7600,26.124,11.8,8322.8,0.1896,15.6,24.579,0.0,36.0,107.090,31.31,30.194,17.000,16.0,0.4,22,0.010636,31,50
2,1,3,-0.325390,1.0529,-0.059379,0.92242,0.72926,0.020476,0.044865,0.43292,-0.081423,-0.765400,-0.054324,0.89314,4.122000e-01,0.077578,11.89000,2.48840,0.028077,1.3957,0.012595,0.068116,0.014847,0.81652,7.1204,0.065220,0.84827,-498.390,13.225,16.25400,0.416570,0.074149,0.36723,9.351300,0.50356,0.91962,1.4931,144.670,-0.066483,0.955790,...,-0.579760,0.57832,-0.064373,11.98800,651.15,10.93400,0.934780,12.909,0.041625,0.047562,280.55,-1.000000,0.488440,0.053299,0.003785,0.005191,0.00000,3.63730,7.02130,1.15380,9.8951,6.4467,30.245,10.3,8747.0,11.9460,15.2,20.700,0.0,35.0,120.870,36.07,35.273,17.000,15.0,-0.2,22,-0.455970,32,51
3,1,4,-0.566570,1.1131,-0.015229,0.85888,0.80974,0.076037,0.091033,0.67546,-0.018807,-0.107910,-0.065316,0.89581,9.949000e-01,0.141120,6.08620,1.63820,0.093904,2.0588,0.011601,0.094385,0.014415,0.90391,7.9828,0.125160,0.80478,-75.867,13.305,8.89500,0.083774,0.054098,0.54360,7.090900,0.67133,0.93701,2.3533,219.750,-0.017000,0.383350,...,-0.150130,0.64508,-0.017731,6.11140,703.04,5.70280,0.874840,13.094,0.108400,0.101350,413.74,0.565000,0.344080,0.073356,0.000037,0.000045,0.00000,5.14420,9.90990,2.04080,-1.4903,-21.9070,34.285,11.5,9042.5,-18.7480,10.4,47.429,4.0,33.0,54.806,39.80,38.377,17.167,16.0,5.6,22,-0.325390,33,52
4,2,1,1.357300,1.0623,0.107020,0.81460,0.83593,0.199960,0.047800,0.74200,0.128030,0.577250,0.094075,0.81549,3.014700e+00,0.185400,4.39380,1.61690,0.239210,3.0311,0.006814,0.079346,0.008876,1.02510,4.7463,0.266020,0.76770,1423.100,11.575,17.48800,0.620770,0.046907,0.56963,9.486100,0.68143,0.94242,4.1296,222.650,0.131230,0.253010,...,0.607660,0.25782,0.131380,4.41510,2465.40,4.14080,0.733980,11.396,0.250310,0.222370,315.34,-0.060101,0.202420,1.229100,-0.002491,-0.002980,0.22688,7.12410,15.38100,3.27020,17.8720,34.6920,30.087,12.8,7991.4,364.9500,15.8,61.476,4.0,36.0,85.437,27.07,26.102,16.000,16.0,0.2,29,1.251000,7,27
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3667,422,10,0.438020,2.2605,0.202890,0.16037,0.18588,0.175970,0.198400,2.22360,1.091500,0.241640,0.226860,0.35580,1.550000e+07,0.839630,0.19101,12.09200,0.946730,8.1062,0.014077,0.000000,0.089439,0.52242,2.5184,0.994340,0.15740,390.260,18.296,0.93687,-0.007698,0.002976,0.34999,0.013104,1.88290,0.98144,1431.7000,29.772,0.570230,0.003544,...,2.748900,0.23629,1.265100,0.19102,1615.10,0.18746,0.005662,16.613,0.948040,0.184820,16961.00,-0.060838,0.000130,0.488130,0.000000,0.000000,0.00000,3.06210,3.95540,-0.53449,-9.3020,11.2070,20.132,12.3,13568.0,14.5290,21.5,33.768,2.0,22.0,100.000,100.00,100.000,17.125,14.5,-7.0,37,0.436380,4,41
3668,422,11,0.482410,1.9615,0.216440,0.20095,0.21642,0.203590,0.189870,1.93820,1.000100,0.270870,0.213610,0.38734,1.920000e+07,0.799050,0.25149,2.63990,0.940730,14.4280,0.018249,0.000000,0.092416,0.55872,2.7304,0.992440,0.19747,443.840,18.360,1.13980,0.066060,0.003484,0.38274,0.010072,1.76850,0.98267,1907.8000,136.370,0.558780,0.004359,...,2.730700,0.23762,1.077100,0.25152,1638.50,0.24713,0.007563,16.829,0.942850,0.214780,20689.00,0.064229,0.000113,0.144840,0.000000,0.000000,0.00000,-7.71400,-11.82400,-25.73600,-21.4110,46.8440,30.046,12.2,26059.0,3.8523,30.5,-10.665,0.0,28.0,91.500,130.50,132.400,20.000,14.5,-16.0,37,0.438020,5,42
3669,422,12,0.500770,1.7099,0.207970,0.26136,0.21399,0.193670,0.183890,1.68980,0.971860,0.281560,0.210970,0.44290,2.030000e+07,0.738640,0.35384,1.49010,0.905010,39.0520,0.007451,0.000000,0.028768,0.48316,2.9064,0.982420,0.25902,475.560,18.469,1.16370,0.115120,0.002343,0.43769,0.022285,2.04530,0.99104,2221.2000,241.600,0.469560,0.003172,...,2.824600,0.24895,0.795720,0.35392,1689.00,0.35067,0.017581,16.927,0.907060,0.210230,34012.00,0.034509,0.000096,0.035889,0.000000,0.000000,0.00000,-0.32511,-0.71099,-3.06590,-17.8990,107.7100,38.823,10.4,31839.0,-25.8410,34.7,36.030,2.0,32.0,87.100,175.90,178.100,20.000,14.5,-20.2,37,0.482410,6,43
3670,422,13,0.611030,1.5590,0.185450,0.30728,0.19307,0.172140,0.170680,1.53890,0.960570,0.267720,0.203190,0.47601,1.099600e+02,0.692720,0.44358,1.38370,0.891630,39.6790,0.021239,0.008109,0.069562,0.40559,2.5221,0.985230,0.30533,457.060,18.543,1.13120,0.077785,0.001942,0.46988,0.023538,2.43380,0.99368,2685.8000,260.160,0.389600,0.002803,...,2.422700,0.24639,0.603540,0.44367,1707.30,0.44077,0.014773,16.899,0.894050,0.190210,35901.00,-0.019839,0.000072,0.120520,-0.001503,-0.007784,0.00000,3.21480,1.57190,7.15620,-4.5408,-20.8610,22.334,10.6,32801.0,-58.1220,15.6,22.571,2.0,30.0,92.900,203.20,204.500,22.000,22.0,6.4,37,0.500770,7,44


In [None]:
df.describe()

Unnamed: 0,Company,Time,Financial Distress,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23,x24,x25,x26,x27,x28,x29,x30,x31,x32,x33,x34,x35,x36,x37,...,x44,x45,x46,x47,x48,x49,x50,x51,x52,x53,x54,x55,x56,x57,x58,x59,x60,x61,x62,x63,x64,x65,x66,x67,x68,x69,x70,x71,x72,x73,x74,x75,x76,x77,x78,x79,x80,x81,x82,x83
count,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,...,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0,3672.0
mean,182.084423,7.528322,1.040257,1.38782,0.129706,0.615769,0.8681599,0.154949,0.106717,0.784031,39.274361,0.33261,0.136263,0.638835,125273.0,0.38423,2.438322,8185.969,-25.039036,2058.577,0.04138,77.549137,0.103999,1.449663,14.19506,0.273236,0.532118,996.000716,13.288179,-77.482669,0.202575,0.083651,0.354824,358087.1,379.867463,0.863141,2859.282,68878.39,0.240324,0.312063,...,3.321408,0.149245,0.298761,1490.89,2484.105343,2.12626,0.726764,12.929332,-21.252442,0.207304,3411.267857,0.007702,0.237413,1.02709,-0.010018,-6.869395,0.113802,2.694738,3.456374,1.299225,-0.703818,31.677402,28.245979,11.459051,15874.639434,21.472377,17.863013,23.30136,1.923501,30.407166,86.839822,91.920506,89.115908,17.780855,15.198708,-2.664305,19.714597,1.100488,13.122277,33.044935
std,117.024636,4.064016,2.652227,1.452926,0.120013,0.177904,0.5719519,0.124904,0.210555,1.033606,4305.688039,0.346135,0.138978,0.201986,1468207.0,0.177904,2.377307,108672.5,1231.623609,60229.26,0.045379,3256.469121,0.231024,0.968705,227.2613,0.171174,0.182469,1822.605137,1.614,4759.884902,0.285898,0.092991,0.170703,8275936.0,10444.296575,0.140751,76330.68,2337751.0,0.29484,0.504356,...,58.382438,0.764287,0.712088,45586.37,2927.820444,2.162011,0.171174,1.761527,1209.468675,0.130165,14441.801918,0.416617,0.139899,25.636779,0.026567,290.276732,0.089494,3.922626,6.953459,8.738037,12.321989,43.852586,6.146999,0.952858,10026.105024,97.191938,7.119361,17.834175,1.469196,3.714512,16.706209,64.656504,64.349382,2.040152,2.828648,8.192663,7.508588,2.666733,9.465907,13.714563
min,1.0,1.0,-8.6317,0.07517,-0.25808,0.016135,5.35e-07,-0.26979,-0.62775,0.03516,-145000.0,-3.6112,-0.31866,0.021491,-2620000.0,0.032101,0.0164,3.54e-06,-35758.0,0.0,0.000102,0.0,0.000127,3e-06,7.73e-07,-0.55385,0.016135,-3374.6,8.1951,-288000.0,-0.49944,0.0,0.002082,0.0,0.01006,0.049372,4.56e-05,0.0,-4.8562,0.0,...,-56.719,-9.3769,-1.0749,0.016401,533.07,0.0164,0.0,0.0,-33563.0,-0.22522,0.010753,-1.0,0.0,-0.99871,-0.27926,-15649.0,-0.34906,-7.714,-11.824,-25.736,-21.411,-21.907,15.916,10.3,7941.8,-58.122,10.4,-10.665,0.0,22.0,54.806,24.318,23.776,15.25,12.0,-20.2,1.0,-0.49922,1.0,2.0
25%,80.0,4.0,0.172275,0.952145,0.048701,0.501888,0.5525575,0.070001,-0.027754,0.436003,0.056185,0.157677,0.03382,0.503125,1.882775,0.250575,1.00755,2.29475,0.088584,1.678475,0.013266,0.013259,0.024805,0.90423,1.925625,0.15648,0.40091,261.765,12.163,-2.066775,0.03521,0.025747,0.227492,4.687925,0.27028,0.81875,2.4429,42.169,0.072883,0.066656,...,0.18788,-0.097194,0.068385,1.01485,1418.925,0.82287,0.63129,11.84475,0.112645,0.116778,443.7025,-0.15414,0.13515,-0.04115,-0.01261,-0.016715,0.005913,-0.075348,-0.71099,-0.53449,-9.302,2.6566,23.905,10.5,9042.5,-25.841,11.9,10.131,0.5,28.0,79.951,39.8,38.377,16.0,13.0,-7.0,14.0,0.189912,6.0,21.0
50%,168.0,7.0,0.583805,1.1836,0.10753,0.63869,0.775245,0.13183,0.104325,0.641875,0.135585,0.30261,0.10727,0.670855,4.5391,0.36131,1.7677,3.91715,0.1759,2.4614,0.028375,0.032881,0.053568,1.2629,3.9662,0.24757,0.5425,605.195,13.139,2.94435,0.153235,0.052712,0.346125,9.41795,0.42715,0.911095,3.66835,91.832,0.1646,0.15748,...,0.493065,0.28983,0.174985,1.7921,1917.7,1.4655,0.75243,12.797,0.207605,0.18669,962.645,0.04753,0.22259,0.071478,-0.002939,-0.004075,0.126035,3.6373,5.7874,1.8883,1.3445,28.286,28.184,11.3,9667.3,0.1896,15.6,22.571,2.0,30.0,90.0,66.12,59.471,17.0,14.5,0.2,20.0,0.594765,11.0,34.0
75%,264.25,11.0,1.35175,1.506475,0.188685,0.749425,1.039,0.21957,0.23123,0.896773,0.273423,0.484035,0.210017,0.80492,15.07025,0.498112,2.9908,8.497025,0.300342,4.266325,0.052265,0.065488,0.108332,1.741775,7.779825,0.36871,0.67189,1260.25,14.152,6.90755,0.300755,0.106177,0.469145,23.26425,0.649988,0.95702,5.764275,156.45,0.336557,0.354313,...,1.103175,0.599633,0.351883,3.057425,2812.5,2.610225,0.84352,13.8035,0.345973,0.283348,2176.0,0.219235,0.319823,0.273432,0.0,0.0,0.196305,5.7265,8.4799,7.1562,5.8267,57.368,30.245,12.2,26059.0,14.529,21.5,36.03,4.0,33.0,93.883,130.5,132.4,20.0,16.0,2.1,26.0,1.35505,17.0,44.0
max,422.0,14.0,128.4,51.954,0.74941,0.9679,6.8356,0.85854,0.92955,38.836,209000.0,3.8102,0.76962,0.99827,38300000.0,0.98386,30.152,3960000.0,42180.0,3540000.0,0.59655,140000.0,5.9646,13.398,10721.0,1.0,0.93563,79551.0,19.106,9327.4,4.6254,0.69046,0.948,361000000.0,312000.0,1.0,4350000.0,102000000.0,5.7661,10.536,...,3201.9,6.0524,34.063,1660000.0,136000.0,26.398,1.5538,19.809,46045.0,0.90456,342000.0,1.0,0.92653,1182.8,0.24727,1.4029,0.86332,7.4166,15.381,14.321,18.77,124.76,39.432,13.85,34501.0,364.95,34.7,61.476,4.0,36.75,120.87,227.5,214.5,22.0,22.0,8.6,37.0,128.4,49.0,74.0


^ From the output above we can see that:
- There are 422 companies in the dataset.
- There are at most 14 time steps, as advertised.
- Some of the features contain huge outliers (x8, x12, ...).
- Besides x80, some other features also appear to be categorical (x72, x79, ...).
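Both observations (outliers and categorical-looking columns) can be screened for automatically rather than read off the `describe()` table. The block below is a rough sketch: the helper `screen_features`, the cutoffs `max_unique=20` and `range_to_iqr=100.0`, and the toy columns are assumptions chosen for illustration and would need tuning on the real data.

```python
import numpy as np
import pandas as pd

def screen_features(df, feature_cols, max_unique=20, range_to_iqr=100.0):
    """Flag low-cardinality and heavy-tailed columns (hypothetical helper)."""
    feats = df[feature_cols]
    # Interquartile range; zero IQR becomes NaN so the ratio is not infinite.
    iqr = (feats.quantile(0.75) - feats.quantile(0.25)).replace(0, np.nan)
    summary = pd.DataFrame({
        "n_unique": feats.nunique(),
        "range_to_iqr": (feats.max() - feats.min()) / iqr,
    })
    summary["maybe_categorical"] = summary["n_unique"] <= max_unique
    summary["heavy_tailed"] = summary["range_to_iqr"] > range_to_iqr
    return summary

# Toy data (made up): `a` has one extreme value, `b` takes three values.
n = 101
toy = pd.DataFrame({
    "a": list(range(n - 1)) + [1e6],
    "b": [2 * (i % 3) for i in range(n)],
    "c": np.linspace(0.0, 1.0, n),
})
summary = screen_features(toy, ["a", "b", "c"])
```

On the real frame, `feature_cols` would be the list of `x...` columns. Flagged columns are only candidates and should be inspected before being treated as categorical or clipped.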