Exploration, analysis and application of machine learning models to predict companies' status

a-brice/bankruptcy-data-exp

bankruptcy-data-exp

The dataset is provided by Sebastian Tomczak and was collected from the Emerging Markets Information Service (EMIS): https://archive.ics.uci.edu/ml/datasets/Polish+companies+bankruptcy+data

STARTER BELOW

The dataset is about bankruptcy prediction for Polish companies. The data comes from EMIS, a database of information on emerging markets around the world; the companies covered here are Polish. Each dataset is composed of thousands of rows, where each row corresponds to a company. The attributes of these companies are described in the data/description.txt file. Here is a sample of what a dataset looks like (the first column, bankrupt, is the label):

bankrupt
0 0.034279 0.42448 -0.075832 0.67532 -77.334 -0.01497 0.044048 1.3558 1.1287 0.57552 0.044048 0.1886 0.11021 0.044048 2069.8 0.17635 2.3558 0.044048 0.064853 22.179 1.0305 0.077574 0.050469 -0.016044 0.57552 0.15333 1.2892 -0.090033 5.1839 0.61859 0.064853 141.67 2.5764 0.18275 0.077574 0.67974 0.60997 0.76644 0.11421 0.04225 0.12876 0.11421 79.459 57.28 0.83056 0.49861 25.035 0.046766 0.068854 0.37158 0.23356 0.38815 0.6833 0.90997 -11581.0 0.11406 0.059561 0.88594 0.33173 16.457 6.3722 125.51 2.908 0.80639
0 0.096308 0.50574 0.48163 1.9523 229.04 0 0.096308 0.97731 3.7981 0.49426 0.15378 0.19043 0.42351 0.096308 114.76 3.1806 1.9773 0.096308 0.025357 6.514 0.60105 0 0.025357 0.32281 0.45095 3.1806 0 38.13 3.0624 0.026525 0.059985 85.534 4.2673 4.2673 0.0045052 3.7981 ? 0.49426 0.0011862 0.80652 0.011148 0 55.688 49.174 1.4208 1.8183 11.464 -1.5122 -0.39815 1.9523 0.50574 0.23434 39.13 39.13 556.01 0.43179 0.19485 0.58486 0 56.033 7.4227 48.601 7.5101 300.69
0 -0.20902 1.2022 -0.2562 0.053378 -108.75 -0.38107 -0.20902 -0.16822 0.82685 -0.20224 -0.14916 -0.77232 -0.098138 -0.20902 -5407.8 -0.067495 0.83178 -0.20902 -0.25279 0 ? -0.14916 -0.25279 -0.20902 -0.59009 -0.067495 -2.4917 -0.25995 2.8692 1.454 -0.1804 101.21 3.6062 0.81183 -0.14916 0.82685 0.015507 0.72936 -0.1804 9.9866e-006 -1.883 -0.1804 6.376 6.376 ? 0.053378 0 -0.27704 -0.33505 0.012016 0.27064 0.2773 -0.20521 0.74005 -189.58 -0.1804 1.0335 1.2528 -4.6064 ? 57.246 119.47 3.0551 0.83897
0 0.20097 0.19291 0.23709 2.229 93.472 0 0.20097 4.1836 2.8936 0.80709 0.20097 1.0418 0.41244 0.20097 59.001 6.1864 5.1836 0.20097 0.069453 9.6467 0.85696 0 0.069453 0.51819 0.76878 6.1864 ? 0.41595 3.7389 0.040859 0.15536 43.706 8.3512 8.3512 0.0066855 2.8936 ? 0.80709 0.0023105 0.38714 0.0064793 0 44.82 35.174 2.6279 1.8326 17.326 -0.99247 -0.34299 2.229 0.19291 0.11974 1.416 1.416 1299.7 0.44323 0.24901 0.55688 0 37.837 10.377 24.334 14.999 5.0765
0 -0.11132 0.64559 0.0041018 1.0071 -38.084 0 -0.11132 0.54897 2.5568 0.35441 -0.026645 -0.19222 -0.022736 -0.11132 -4053.6 -0.090045 1.549 -0.11132 -0.043539 37.421 1.0872 -0.09603 -0.043539 -0.1175 -0.085977 -0.090045 -1.1341 0.0098424 3.3675 0.25123 -0.033709 81.987 4.4519 3.9938 -0.021489 2.5568 7.0439 0.4 -0.0084045 0.021281 -0.50233 -0.037558 81.502 44.081 -0.42467 0.55446 37.109 -0.14922 -0.058361 0.90344 0.57915 0.22462 0.85041 0.9598 9.56 -0.0084045 -0.31411 1.042 0.12863 9.7539 8.2802 82.676 4.4148 6.1352
1 -0.40937 0.58325 0.20188 1.3461 -0.7769 0.0 -0.40937 0.71453 9.8193 0.41675 -0.25112 -0.70189 -0.024009 -0.40937 -903.02 -0.4042 1.7145 -0.40937 -0.041691 2.7925 ? -0.37487 -0.041691 -0.40937 -0.20825 -0.4042 -2.3689 0.94005 1.9031 0.016142 -0.041691 20.883 17.478 17.478 -0.37487 9.943 ? 0.41675 -0.038178 0.98264 -0.096605 -0.038178 7.8804 5.0879 -5.4493 1.1114 2.6898 -0.5485 -0.05586 1.3461 0.58325 0.057214 1.9406 1.9406 16.15 -0.03819 -0.9823 1.0253 0.0 130.71 71.739 21.681 16.835 45.724
1 -0.19899 0.42164 0.57836 2.3717 50.094 -0.20152 -0.19899 1.3717 3.9931 0.57836 -0.19797 -0.47193 -0.041069 -0.19899 -938.46 -0.38893 2.3717 -0.19899 -0.049833 0 ? -0.19831 -0.049833 -0.19899 -0.38529 -0.38893 -195.5 ? 1.772 -0.054405 -0.049833 36.719 9.9407 9.9407 -0.19831 3.9932 ? 0.57836 -0.049663 1.5152 -0.086059 -0.049663 33.009 33.009 ? 1.5152 0 -0.23331 -0.058428 2.3717 0.42164 0.1006 ? ? 34.21 -0.049621 -0.34405 1.0496 0.0 ? 11.058 38.541 9.4703 ?
1 0.14806 0.83471 -0.050636 0.8059 -43.448 -0.34617 0.16452 0.19804 0.82432 0.16531 0.22872 0.63064 0.26792 0.16452 1379.5 0.26459 1.198 0.16452 0.19958 25.982 ? 0.2287 0.17961 0.16452 -0.19827 0.24487 3.5621 -0.064112 3.5149 0.95298 0.20139 94.239 3.8957 1.2175 -0.18627 1.2451 0.28541 0.69632 -0.22597 0.21346 0.097614 0.27744 68.433 42.451 2.5232 0.43839 21.074 0.17236 0.2091 0.25187 0.26087 0.25669 0.2093 0.88164 -165.73 -0.22572 0.89565 0.81625 3.2123 14.048 8.5981 115.51 3.1599 1.0437
1 -0.092469 0.8223 -0.18051 0.75948 -200.23 0.0 -0.096245 0.2161 1.1797 0.1777 -0.059312 -0.12824 -0.046759 -0.096245 -5441.4 -0.067079 1.2161 -0.096245 -0.081588 126.65 0.98898 -0.07112 -0.078387 0.91179 0.088247 -0.062487 -1.9257 -0.41978 4.4644 0.68549 -0.058165 258.64 1.4744 1.3456 -0.079529 1.1797 5.0715 0.20938 -0.067417 0.021861 -0.91265 -0.060289 171.28 44.637 -0.22591 0.21409 135.02 -0.11221 -0.095118 0.69316 0.7505 0.67826 0.41323 0.48691 -5259 0.10219 -0.52038 0.9693 0.17829 2.882 8.177 232.21 1.5718 2.7433
1 -0.006009 0.87154 -0.30285 0.65251 -20.725 -2.1532 -0.006009 0.14746 5.5023 0.12852 0.12882 -0.0068951 0.018794 -0.006009 3076.2 0.11865 1.1474 -0.006009 -0.0010922 0.0078939 ? 0.12882 -0.0010922 -0.006009 -2.1592 0.11865 0.95543 -0.70217 2.2255 0.10544 -0.0010381 57.891 6.305 6.305 0.007199 5.6238 ? 0.12852 0.0013084 0.34244 0.12194 0.023411 17.927 17.919 -50.5 0.34257 0.0079043 0.019397 0.0035252 0.65251 0.87154 0.15861 0.29797 0.29797 -50.9 0.0013192 -0.046759 0.97709 0.0 46239 20.369 57.815 6.3133 12.757

This is a supervised learning problem: a binary classification on labelled data, where the label indicates whether or not the company went bankrupt.

The goal of this project is to explore, analyse and visualise the data, then apply machine learning algorithms to predict, as accurately as possible, whether a company will go bankrupt.

In total, the dataset covers the status of more than 43,000 companies, unevenly distributed over 5 years. The columns hold the 64 variables we will use to predict bankruptcy. Many of these variables are built from the same underlying quantities, such as total assets, total liabilities and net profit, which we could use directly in order to include fewer variables in our model while losing little information. There are also many missing values, marked by a "?", across many fields.
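As a minimal sketch of how the "?" markers could be handled when reading a row, the snippet below parses one of the sample rows shown above using only the Python standard library. The names `parse_row` and `missing_positions` are illustrative and not part of this project; the sketch assumes whitespace-separated values with the label in the first position, as in the sample.

```python
import math

MISSING = "?"  # marker used for missing values in the dataset


def parse_row(line):
    """Split one whitespace-separated row into (label, features).

    Assumes the layout of the sample above: the bankrupt label first,
    then the numeric features. Missing values ("?") become NaN.
    """
    tokens = line.split()
    label = int(float(tokens[0]))
    features = [math.nan if t == MISSING else float(t) for t in tokens[1:]]
    return label, features


def missing_positions(features):
    """Return the 1-based positions of the features that are missing."""
    return [i for i, v in enumerate(features, start=1) if math.isnan(v)]


# One bankrupt company, copied from the sample rows above.
SAMPLE_ROW = (
    "1 -0.19899 0.42164 0.57836 2.3717 50.094 -0.20152 -0.19899 1.3717 "
    "3.9931 0.57836 -0.19797 -0.47193 -0.041069 -0.19899 -938.46 -0.38893 "
    "2.3717 -0.19899 -0.049833 0 ? -0.19831 -0.049833 -0.19899 -0.38529 "
    "-0.38893 -195.5 ? 1.772 -0.054405 -0.049833 36.719 9.9407 9.9407 "
    "-0.19831 3.9932 ? 0.57836 -0.049663 1.5152 -0.086059 -0.049663 33.009 "
    "33.009 ? 1.5152 0 -0.23331 -0.058428 2.3717 0.42164 0.1006 ? ? 34.21 "
    "-0.049621 -0.34405 1.0496 0.0 ? 11.058 38.541 9.4703 ?"
)

label, features = parse_row(SAMPLE_ROW)
print("label:", label)
print("number of features:", len(features))
print("missing feature positions:", missing_positions(features))
```

Storing missing values as NaN keeps every row at the same length, so a later step can decide whether to drop or impute them.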

To launch the API


Start by cloning the project:

git clone https://github.com/a-brice/bankruptcy-data-exp.git
cd bankruptcy-data-exp

To launch the API, you must first install some dependencies. On Windows, from a shell in the root directory, enter the following:

python -m venv env
.\env\Scripts\activate
pip install -r requirement.txt

Once all installations are complete (this takes a while), you can run the API:

cd api
python manage.py runserver

After that, go to http://127.0.0.1:8000 and explore!
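To check from a script that the development server is answering, something like the following sketch may help. It uses only the standard library; `server_is_up` is a hypothetical helper, not part of this project, and it only tests that the root URL above responds, without assuming any particular endpoint.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://127.0.0.1:8000", timeout=2.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # An empty ProxyHandler bypasses any system proxy, which is what we
    # want when talking to a server on localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (e.g. 404).
        return True
    except OSError:
        # Connection refused, timeout, unreachable host, and so on.
        return False


if __name__ == "__main__":
    print("API reachable:", server_is_up())
```

If this prints `API reachable: False`, make sure `python manage.py runserver` is still running in another shell.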