# Part 0: Get to Know the Data

There are four data files associated with this project:

- `Udacity_AZDIAS_052018.csv`: Demographics data for the general population of Germany; 891 211 persons (rows) x 366 features (columns).
- `Udacity_CUSTOMERS_052018.csv`: Demographics data for customers of a mail-order company; 191 652 persons (rows) x 369 features (columns).
- `Udacity_MAILOUT_052018_TRAIN.csv`: Demographics data for individuals who were targets of a marketing campaign; 42 982 persons (rows) x 367 (columns).
- `Udacity_MAILOUT_052018_TEST.csv`: Demographics data for individuals who were targets of a marketing campaign; 42 833 persons (rows) x 366 (columns).

Each row of the demographics files represents a single person, but also includes information outside of individuals, including information about their household, building, and neighborhood. Use the information from the first two files to figure out how customers ("CUSTOMERS") are similar to or differ from the general population at large ("AZDIAS"), then use your analysis to make predictions on the other two files ("MAILOUT"), predicting which recipients are most likely to become a customer for the mail-order company.

The "CUSTOMERS" file contains three extra columns ('CUSTOMER_GROUP', 'ONLINE_PURCHASE', and 'PRODUCT_GROUP'), which provide broad information about the customers depicted in the file. The original "MAILOUT" file included one additional column, "RESPONSE", which indicated whether or not each recipient became a customer of the company. For the "TRAIN" subset, this column has been retained, but in the "TEST" subset it has been removed; it is against that withheld column that your final predictions will be assessed in the Kaggle competition.

Otherwise, all of the remaining columns are the same between the three data files. For more information about the columns depicted in the files, you can refer to two Excel spreadsheets provided in the workspace. [One of them](./DIAS Information Levels - Attributes 2017.xlsx) is a top-level list of attributes and descriptions, organized by informational category. [The other](./DIAS Attributes - Values 2017.xlsx) is a detailed mapping of data values for each feature in alphabetical order.

In the below cell, we've provided some initial code to load in the first two datasets. Note for all of the `.csv` data files in this project that they're semicolon (`;`) delimited, so an additional argument in the [`read_csv()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html) call has been included to read in the data properly. Also, considering the size of the datasets, it may take some time for them to load completely.

You'll notice when the data is loaded in that a warning message will immediately pop up. Before you really start digging into the modeling and analysis, you're going to need to perform some cleaning. Take some time to browse the structure of the data and look over the informational spreadsheets to understand the data values. Make some decisions on which features to keep, which features to drop, and if any revisions need to be made on data formats. It'll be a good idea to create a function with pre-processing steps, since you'll need to clean all of the datasets before you work with them.
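
One way to investigate a mixed-type warning like this is to read the suspect columns as strings and list the entries that cannot be parsed as numbers. The sketch below is only an illustration on a tiny in-memory sample: the sample values and the helper `non_numeric_values` are hypothetical, and the two column names are borrowed from later in this notebook.

```python
import io
import pandas as pd

# Tiny stand-in for the real files (hypothetical values, same ';' delimiter).
sample_csv = "LNR;CAMEO_DEUG_2015;CAMEO_INTL_2015\n1;8;51\n2;X;XX\n3;4;24\n"

# Reading the suspect columns as strings avoids mixed-type inference.
sample = pd.read_csv(
    io.StringIO(sample_csv),
    sep=';',
    dtype={'CAMEO_DEUG_2015': str, 'CAMEO_INTL_2015': str},
)

def non_numeric_values(series):
    """Return the distinct non-missing entries that cannot be parsed as numbers."""
    parsed = pd.to_numeric(series, errors='coerce')
    return sorted(series[parsed.isna() & series.notna()].unique())

odd_values = {col: non_numeric_values(sample[col])
              for col in ['CAMEO_DEUG_2015', 'CAMEO_INTL_2015']}
print(odd_values)
```

Running the same helper on the real columns would show which placeholder strings need cleaning before the columns can be converted to numbers.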

# 1. Importing libraries and Data

In [None]:
# import libraries here; add more as necessary
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# magic word for producing visualizations in notebook
%matplotlib inline

In [None]:
azdias = pd.read_csv('Udacity_AZDIAS_052018.csv', sep=';')
customers = pd.read_csv('Udacity_CUSTOMERS_052018.csv', sep=';')
customers=customers.drop(['PRODUCT_GROUP','CUSTOMER_GROUP','ONLINE_PURCHASE'],axis=1)

In [None]:
# As per the warning, these columns need to be checked for inconsistency; handled in the data preprocessing step
azdias.columns[18:20]

### Top 3 rows of azdias and customers

In [None]:
print("shape of azdias dataset",azdias.shape)
azdias.head(3)

In [None]:
print("shape of customers dataset",customers.shape)
customers.head(3)

# 2. Exploratory Data Analysis

## 2.1 Descriptive Statistics

In [None]:
azdias.describe()

In [None]:
customers.describe()

## 2.2 Missing Values

#### Earlier we saw a lot of columns with NaN values. Let's explore the missing values further.

In [None]:
# creating a dataframe to get count/percentage of missing values in azdias and customers
azdias_nan=pd.DataFrame(azdias.isna().sum(axis=0)).reset_index()
azdias_nan.columns=['column','azdias_nan']
azdias_nan['azdiaz_nan_percent']=(azdias_nan['azdias_nan']/len(azdias))*100

customers_nan=pd.DataFrame(customers.isna().sum(axis=0)).reset_index()
customers_nan.columns=['column','customers_nan']
customers_nan['customer_nan_percent']=(customers_nan['customers_nan']/len(customers))*100
missing_values=pd.merge(azdias_nan, customers_nan, on='column')
missing_values=missing_values.sort_values(by=['azdias_nan','customers_nan'],ascending=[False,False])
missing_values
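
The same summary can be built more compactly, because the mean of a boolean `isna()` mask is the fraction of missing entries. The following is a minimal sketch on toy data; the helper name `missing_percent` and the toy frames are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

def missing_percent(frames):
    """frames: dict of name -> DataFrame. Returns percent missing per column, one column per frame."""
    table = pd.DataFrame({name: df.isna().mean() * 100 for name, df in frames.items()})
    # sort by the first frame, then the second, largest share of missing values first
    return table.sort_values(by=list(frames), ascending=False)

# toy stand-ins for azdias and customers
toy_a = pd.DataFrame({'x': [1, np.nan, np.nan, np.nan], 'y': [1, 2, 3, np.nan]})
toy_b = pd.DataFrame({'x': [np.nan, np.nan], 'y': [1, 2]})

summary = missing_percent({'toy_a': toy_a, 'toy_b': toy_b})
print(summary)
```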

#### Plotting a grouped bar chart

In [None]:
plt.figure(figsize=(20,8))
# set width of bar
barWidth = 0.25
# set height of bar
bars1 = missing_values['azdiaz_nan_percent'][:10]
bars2 = missing_values['customer_nan_percent'][:10]
label=missing_values['column'][:10]
# Set position of bar on X axis
r1 = np.arange(len(bars1))
r2 = [x + barWidth for x in r1]
 
# Make the plot
plt.bar(r1, bars1, width=barWidth, edgecolor='white')
plt.bar(r2, bars2, width=barWidth, edgecolor='white')

 
# Add xticks on the middle of the group bars
plt.xlabel('columns')
plt.ylabel('Percentage')
plt.xticks([r + barWidth/2 for r in range(len(bars1))], label,rotation=50)
 
# Create legend & Show graphic
plt.title('Percentage of missing values sorted - columns wise')
plt.legend(['azdiaz_nan_percent','customer_nan_percent'],prop={'size': 12})
plt.show()


##### We see that the four columns with the most missing values have more than 90% missing data

In [None]:
# we will drop these columns, creating a variable to do so
drop_cols=list(label[:4])
drop_cols

In [None]:
row_na=pd.DataFrame(azdias.isna().sum(axis=1))
row_na.columns=['na_count']
row_na['percent_information_retained']=((azdias.shape[1]-row_na['na_count'])/azdias.shape[1])*100
row_na

In [None]:
a=np.arange(0,azdias.shape[1])
# fraction of a row's columns that is retained when `a` values are missing
b=1-a/azdias.shape[1]
plt.figure(figsize=(10,8))
plt.plot(a,b)
plt.xlabel('Count of missing data')
plt.ylabel('Fraction of data retained')
plt.title('Data Retention rate with missing value')
plt.show()

**If a row has 20 missing values, it retains 94.54% of its data.**

In [None]:
b[20]

#### Distribution of information retained per row

In [None]:
plt.figure(figsize=(16,8))
row_na_distribution=row_na.groupby(['percent_information_retained']).count()
plt.bar(row_na_distribution.index,row_na_distribution.na_count)
plt.xlabel('Percentage of information retained')
plt.ylabel('Number of rows')
plt.title('Distribution of data retention')
plt.show()

**Observation:** *We can see that the majority of rows retain more than 90% of their information.*

# 3. Data Preprocessing

## 3.0 Making data consistent
1. Unify numeric data types (float and int) to int only, as in the documentation
2. Check that the values present in the dataset match the DIAS attribute values

### 3.0.1 Unifying numeric data types (float and int) to int only, as in the documentation

In [None]:
# Getting all numeric columns and categorical columns
Numeric_columns=azdias.select_dtypes(include=np.number).columns.tolist()
categorical_col=set(azdias.columns).difference(set(Numeric_columns))
print("Length of Numeric columns",len(Numeric_columns))
print("Length of categorical columns",len(categorical_col))
print("Categorical columns are",categorical_col)

In [None]:
# azdias[Numeric_columns] = azdias[Numeric_columns].astype('Int64')
# customers[Numeric_columns] = customers[Numeric_columns].astype('Int64')
# note: pd.to_numeric leaves columns that are already numeric unchanged, so float columns stay float
azdias[Numeric_columns] = azdias[Numeric_columns].apply(pd.to_numeric)
customers[Numeric_columns] = customers[Numeric_columns].apply(pd.to_numeric)
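
The commented-out lines point at pandas' nullable integer dtype. As a small illustration on toy data (not the project files), the sketch below contrasts `pd.to_numeric`, which keeps a float column with missing values as float, with `astype('Int64')`, which stores integers and keeps the missing entry as `<NA>`.

```python
import numpy as np
import pandas as pd

# toy frame: 'a' is float only because it contains a missing value
toy = pd.DataFrame({'a': [1.0, 2.0, np.nan], 'b': [3, 4, 5]})

# pd.to_numeric does not change columns that are already numeric
as_numeric = toy.apply(pd.to_numeric)

# the nullable 'Int64' dtype holds integers and keeps missing entries as <NA>
as_int = toy.astype('Int64')

print(as_numeric.dtypes)
print(as_int.dtypes)
print(as_int)
```

Whether the nullable dtype works with every later step (for example scaling) would need to be checked before adopting it.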

In [None]:
print("Azdias\n",azdias.dtypes.head())
print("\t")
print("Customers\n",customers.dtypes.head())

### 3.0.2 Checking that the data in the dataset matches the DIAS attribute values

In [None]:
# reading the DIAS Attributes
dias_df=pd.read_excel('DIAS Attributes - Values 2017.xlsx',skiprows=1,usecols=['Attribute','Description','Value','Meaning'])
# forward-fill the cells left empty below each attribute; fillna(method='ffill') is deprecated in recent pandas
dias_df=dias_df.ffill()
dias_df.head(6)
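
To make the forward-fill step concrete, here is a minimal sketch on a toy table shaped like the spreadsheet: the attribute name appears only on the first row of each group, and some `Value` cells hold several codes as a string such as `'-1, 0'`. The attribute names, values, and the helper `allowed_values` are hypothetical.

```python
import numpy as np
import pandas as pd

# toy stand-in for the attribute/value sheet
toy_dias = pd.DataFrame({
    'Attribute': ['ATTR_A', np.nan, np.nan, 'ATTR_B', np.nan],
    'Value': ['-1, 0', 1, 2, -1, 1],
})

# carry each attribute name down to the rows that belong to it
toy_dias['Attribute'] = toy_dias['Attribute'].ffill()

def allowed_values(values):
    """Collect the integer codes listed for one attribute."""
    allowed = set()
    for v in values:
        if isinstance(v, str):
            # a cell like '-1, 0' lists several codes at once
            allowed.update(int(part) for part in v.split(','))
        else:
            allowed.add(int(v))
    return allowed

allowed = {name: allowed_values(group)
           for name, group in toy_dias.groupby('Attribute')['Value']}
print(allowed)
```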

#### 3.0.2.a Numeric Data check

In [None]:
# Checking numeric columns first - so taking intersection of common columns
Attribute=set(azdias.columns).intersection(dias_df.Attribute).intersection(Numeric_columns)

In [None]:
def consistency_check():
    '''
    For each attribute, collect the values allowed by the DIAS documentation
    and compare them with the values present in azdias.
    Returns:
        List of columns we need to check before continuing
    '''
    inconsistent=[]
    for i in Attribute:
        allowed=set()
        try:
            for m in dias_df[dias_df['Attribute']==i]['Value']:
                if isinstance(m,str):
                    # entries such as '-1, 0' list several allowed values
                    allowed.update(int(v) for v in m.split(','))
                else:
                    allowed.add(m)
        except ValueError:
            # the documented value could not be parsed as integers
            inconsistent.append(i)
            continue
        extra=set(azdias[i].dropna().unique()).difference(allowed)
        if extra:
            # the column contains values that are not in the documentation
            print('--------- Failed ---------')
            print(i,extra)
            inconsistent.append(i)
    return inconsistent
inconsistent=consistency_check()
inconsistent

**NOTE**: Upon checking the above columns against the documentation, we found that these values should not be in the columns.
I believe that {0} and {6} were added to the dataset where the value for that column is unknown. So, we will move forward with this assumption in mind.


#### 3.0.2.b Categorical data check

In [None]:
# Now lets check the categorical columns
category={}
def consistency_check_cat():
    for i in categorical_col:
        category[i]=azdias[i].dropna().unique()
    return category
consistency_check_cat()

**NOTE:** Upon checking with the documentation:
- EINGEFUEGT_AM: dropped for now, since one-hot encoding it would create a lot of columns
- D19_LETZTER_KAUF_BRANCHE: refers to other columns; dropped for now, since one-hot encoding it would create a lot of columns
- CAMEO_DEUG_2015 should not have 'X'; we need to replace it with -1 as unknown
- CAMEO_INTL_2015 should not have 'XX'; we need to replace it with -1 as unknown
- CAMEO_DEU_2015 should not have 'XX'; we need to replace it with -1 as unknown

In [None]:
azdias=azdias.drop('D19_LETZTER_KAUF_BRANCHE',axis=1)
customers=customers.drop('D19_LETZTER_KAUF_BRANCHE',axis=1)

azdias=azdias.drop('EINGEFUEGT_AM',axis=1)
customers=customers.drop('EINGEFUEGT_AM',axis=1)

In [None]:
azdias[['CAMEO_DEUG_2015','CAMEO_INTL_2015','CAMEO_DEU_2015']]=azdias[['CAMEO_DEUG_2015','CAMEO_INTL_2015','CAMEO_DEU_2015']].replace(['X','XX'],-1)
azdias[['CAMEO_DEUG_2015','CAMEO_INTL_2015']]=azdias[['CAMEO_DEUG_2015','CAMEO_INTL_2015']].apply(pd.to_numeric)

customers[['CAMEO_DEUG_2015','CAMEO_INTL_2015','CAMEO_DEU_2015']]=customers[['CAMEO_DEUG_2015','CAMEO_INTL_2015','CAMEO_DEU_2015']].replace(['X','XX'],-1)
customers[['CAMEO_DEUG_2015','CAMEO_INTL_2015']]=customers[['CAMEO_DEUG_2015','CAMEO_INTL_2015']].apply(pd.to_numeric)


## 3.1 Data uniformity

#### Unknown is represented as (-1, 0) or (-1, 9) in the dataset.
*Having two codes for unknown in the same column (for example both 0 and -1) is ambiguous. We resolve this by replacing 0, or 9, with -1 in the affected columns.*

- Columns documented with (-1, 0) should use only -1 for unknown.
- Columns documented with (-1, 9) should use only -1 for unknown.

The cells below handle this condition.
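
As a small illustration of the idea on toy data, the sketch below folds a second "unknown" code into -1, column by column. The column names, the mapping `extra_unknown_code`, and the helper `unify_unknown` are hypothetical and only stand in for the real attribute lists.

```python
import pandas as pd

toy = pd.DataFrame({
    'col_zero': [0, -1, 2, 3],   # documented unknown codes: -1 and 0
    'col_nine': [9, -1, 1, 9],   # documented unknown codes: -1 and 9
    'other':    [0, 9, 1, 2],    # 0 and 9 are ordinary values here
})

# hypothetical mapping: column -> extra code that should be folded into -1
extra_unknown_code = {'col_zero': 0, 'col_nine': 9}

def unify_unknown(df, mapping, unknown=-1):
    """Return a copy in which each column's extra unknown code is replaced by `unknown`."""
    out = df.copy()
    for col, code in mapping.items():
        out[col] = out[col].replace(code, unknown)
    return out

unified = unify_unknown(toy, extra_unknown_code)
print(unified)
```

Restricting the replacement to the listed columns matters, because 0 or 9 can be a regular value elsewhere.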

In [None]:
ambiguity_10=list(set(list(dias_df[dias_df['Value']=='-1, 0']['Attribute'])).intersection(set(azdias.columns)))
ambiguity_90=list(set(list(dias_df[dias_df['Value']=='-1, 9']['Attribute'])).intersection(set(azdias.columns)))

In [None]:
print(ambiguity_10)
print(ambiguity_90)

In [None]:
azdias[ambiguity_10]=azdias[ambiguity_10].replace(0,-1)
azdias[ambiguity_90]=azdias[ambiguity_90].replace(9,-1)

customers[ambiguity_10]=customers[ambiguity_10].replace(0,-1)
customers[ambiguity_90]=customers[ambiguity_90].replace(9,-1)


## 3.2 Handling Missing values

### 3.2.1 Dropping rows and columns with too many missing values

#### Rows

In [None]:
# keeping rows with at most 20 missing values (about 94% or more of the columns retained)
new_azdias=azdias[azdias.isnull().sum(axis=1)<=20]

new_customers=customers[customers.isnull().sum(axis=1)<=20]

#### Columns

In [None]:
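# dropping columns which have more than 90% missing data
#drop_cols=missing_values['column'][:4]
new_azdias=new_azdias.drop(drop_cols,axis=1)

# note: this starts again from `customers`, so the row filter above is not applied to the customers data
new_customers=customers.drop(drop_cols,axis=1)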

In [None]:
print("shape of azdias dataset before is",azdias.shape)
print("shape of customers dataset before is",customers.shape)
print("\n")
print("shape of azdias dataset after is",new_azdias.shape)
print("shape of customers dataset after is",new_customers.shape)

### 3.2.2 Replacing NaN
**NOTE:** The DIAS Attributes file describes the attributes and uses 0 or -1 to represent unknown. However, NaN does not necessarily mean unknown.
NaN could have been caused by many factors, such as human error.

So we will not use fillna() to replace NaN with -1 yet.

##### ToDo: Experiment and check how using fillna to replace NaN changes efficiency

How to handle missing values:
- Replace with mode: viable solution in our situation
- Replace with mean: filling with the mean will introduce decimals into the dataset, which can affect efficiency
- Replace with ffill: not a viable solution
- Replace with bfill: not a viable solution
- etc.
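
The difference between the first two options can be seen on a toy column of integer codes. This is only a sketch with made-up values: the mode keeps the fill value inside the set of existing codes, while the mean produces a value that is not a valid code.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({'code': [1, 1, 5, np.nan], 'label': ['W', 'O', 'W', None]})

# mode().iloc[0] holds the most frequent value of every column
mode_filled = toy.fillna(toy.mode().iloc[0])

# the mean of the observed codes is (1 + 1 + 5) / 3, which is not a code that occurs in the data
mean_filled = toy['code'].fillna(toy['code'].mean())

print(mode_filled)
print(mean_filled)
```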


In [None]:
new_azdias=new_azdias.fillna(new_azdias.mode().iloc[0])
new_customers=new_customers.fillna(new_customers.mode().iloc[0])

In [None]:
new_azdias.isna().sum().sum()

In [None]:
new_customers.isna().sum().sum()

### 3.2.3 One hot encoding categorical columns

In [None]:
new_customers.dtypes.unique()
new_customers.select_dtypes(include ='O').columns

In [None]:
new_azdias=pd.get_dummies(new_azdias)
new_customers=pd.get_dummies(new_customers)


In [None]:
print("new shape of azdias after one hot encoding",new_azdias.shape)
print("new shape of customers after one hot encoding",new_customers.shape)
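
Encoding the two datasets separately only yields matching columns if both contain the same categories. One possible safeguard, shown here as a sketch on toy data with hypothetical column names, is to reindex the second frame to the columns of the first after encoding.

```python
import pandas as pd

population = pd.DataFrame({'region': ['O', 'W', 'W'], 'score': [1, 2, 3]})
buyers = pd.DataFrame({'region': ['W', 'W'], 'score': [4, 5]})  # category 'O' never occurs

population_encoded = pd.get_dummies(population)

# align to the population columns; categories missing from buyers become all-zero columns
buyers_encoded = pd.get_dummies(buyers).reindex(
    columns=population_encoded.columns, fill_value=0
)

print(list(population_encoded.columns))
print(list(buyers_encoded.columns))
```

Note that `reindex` also drops any column that exists only in the second frame, which may or may not be what is wanted.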

## 3.3 Standardizing the Dataset

In [None]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
new_azdias[new_azdias.columns] = scaler.fit_transform(new_azdias)
new_customers[new_customers.columns] = scaler.fit_transform(new_customers)
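
The cell above fits the scaler separately on each dataset, so each one ends up centered on its own mean. If the aim is to see how customers differ from the general population, one alternative (an assumption on my part, not what this notebook does) is to reuse the population statistics for both. With scikit-learn this would mean calling `fit` on one dataset and `transform` on the other; the sketch below shows the same arithmetic in plain pandas on toy data.

```python
import pandas as pd

population = pd.DataFrame({'f1': [1.0, 2.0, 3.0], 'f2': [10.0, 20.0, 30.0]})
buyers = pd.DataFrame({'f1': [3.0, 3.0], 'f2': [10.0, 30.0]})

# statistics come from the population only (ddof=0 is the population standard deviation)
mean = population.mean()
std = population.std(ddof=0)

population_scaled = (population - mean) / std
buyers_scaled = (buyers - mean) / std

# buyers keep their offset relative to the population instead of being re-centered at 0
print(buyers_scaled)
```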

In [None]:
new_azdias.shape

In [None]:
new_customers.shape

In [None]:
(745305, 405)

(191652, 405)

# We need to refactor all the above tasks to avoid repetition

In [None]:
# import libraries here; add more as necessary
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# magic word for producing visualizations in notebook
%matplotlib inline

In [2]:
azdias = pd.read_csv('Udacity_AZDIAS_052018.csv', sep=';')
customers = pd.read_csv('Udacity_CUSTOMERS_052018.csv', sep=';')


In [3]:
ambiguity_10=['WOHNDAUER_2008', 'KKK', 'W_KEIT_KIND_HH', 'ALTERSKATEGORIE_GROB', 'NATIONALITAET_KZ', 'PRAEGENDE_JUGENDJAHRE', 'KBA05_GBZ', 'TITEL_KZ', 'REGIOTYP', 'KBA05_BAUMAX', 'HH_EINKOMMEN_SCORE', 'GEBAEUDETYP', 'ANREDE_KZ']
ambiguity_90=['KBA05_HERST1', 'KBA05_ALTER4', 'KBA05_SEG3', 'SEMIO_KRIT', 'KBA05_KRSKLEIN', 'KBA05_MOD4', 'KBA05_HERST5', 'KBA05_SEG4', 'KBA05_ZUL3', 'KBA05_ZUL4', 'SEMIO_PFLICHT', 'KBA05_KRSZUL', 'KBA05_FRAU', 'KBA05_KRSAQUOT', 'KBA05_MAXBJ', 'KBA05_ZUL1', 'KBA05_MOTOR', 'KBA05_KW3', 'KBA05_MAXAH', 'KBA05_MOD3', 'SEMIO_SOZ', 'KBA05_MODTEMP', 'KBA05_SEG1', 'KBA05_SEG9', 'SEMIO_FAM', 'SEMIO_KAEM', 'KBA05_ALTER1', 'KBA05_DIESEL', 'KBA05_HERST4', 'KBA05_MAXSEG', 'KBA05_MOTRAD', 'RELAT_AB', 'SEMIO_ERL', 'KBA05_MOD2', 'KBA05_VORB0', 'KBA05_VORB2', 'KBA05_ALTER3', 'KBA05_KW2', 'KBA05_ANHANG', 'KBA05_MOD8', 'KBA05_SEG6', 'KBA05_SEG5', 'KBA05_KRSOBER', 'KBA05_AUTOQUOT', 'KBA05_ALTER2', 'SEMIO_VERT', 'KBA05_KW1', 'KBA05_SEG8', 'KBA05_VORB1', 'ZABEOTYP', 'KBA05_KRSHERST3', 'KBA05_SEG10', 'KBA05_CCM4', 'SEMIO_LUST', 'KBA05_KRSHERST1', 'KBA05_KRSVAN', 'KBA05_MAXHERST', 'KBA05_SEG7', 'SEMIO_RAT', 'KBA05_CCM3', 'KBA05_CCM1', 'KBA05_CCM2', 'KBA05_ZUL2', 'KBA05_MAXVORB', 'KBA05_SEG2', 'KBA05_KRSHERST2', 'SEMIO_KULT', 'KBA05_HERSTTEMP', 'SEMIO_REL', 'SEMIO_DOM', 'SEMIO_TRADV', 'KBA05_HERST3', 'KBA05_HERST2', 'KBA05_MOD1', 'SEMIO_MAT']


In [4]:
from sklearn.preprocessing import StandardScaler
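def preprocess(df,name=None):
    '''
    Apply the cleaning steps from the sections above to a raw dataframe and return the scaled result.
    Relies on the ambiguity_10 and ambiguity_90 lists defined in the previous cell.
    '''
    print("Shape before",df.shape)
    if name=='customers':
        # these three columns exist only in the customers file
        df=df.drop(['PRODUCT_GROUP','CUSTOMER_GROUP','ONLINE_PURCHASE'],axis=1)
    
    if name=='azdias':
        # keep rows with at most 20 missing values
        df=df[df.isnull().sum(axis=1)<=20].reset_index(drop=True)
    # finding numeric and categorical columns

    Numeric_columns=df.select_dtypes(include=np.number).columns.tolist()
    categorical_col=set(df.columns).difference(set(Numeric_columns))
    # numeric cols to numeric
    print(categorical_col)
    df[Numeric_columns]=df[Numeric_columns].apply(pd.to_numeric)
    # drop columns that would create a lot of columns when one-hot encoded
    df=df.drop('D19_LETZTER_KAUF_BRANCHE',axis=1)
    df=df.drop('EINGEFUEGT_AM',axis=1)
    # replace the undocumented 'X'/'XX' entries with -1 (unknown) and convert to numeric
    df[['CAMEO_DEUG_2015','CAMEO_INTL_2015','CAMEO_DEU_2015']]=df[['CAMEO_DEUG_2015','CAMEO_INTL_2015','CAMEO_DEU_2015']].replace(['X','XX'],-1)
    df[['CAMEO_DEUG_2015','CAMEO_INTL_2015']]=df[['CAMEO_DEUG_2015','CAMEO_INTL_2015']].apply(pd.to_numeric)

    # use -1 as the only code for unknown
    df[ambiguity_10]=df[ambiguity_10].replace(0,-1)
    df[ambiguity_90]=df[ambiguity_90].replace(9,-1)
    # drop the columns with more than 90% missing data
    df=df.drop(['ALTER_KIND4', 'ALTER_KIND3', 'ALTER_KIND2', 'ALTER_KIND1'],axis=1)
    # fill the remaining NaN values with the mode of each column
    df=df.fillna(df.mode().iloc[0])
    print("Number of nan Values",df.isna().sum().sum())
    # one-hot encode the remaining categorical columns
    df=pd.get_dummies(df)
    
    scaler = StandardScaler()
    df[df.columns] = scaler.fit_transform(df)
    print("shape after",df.shape)
    # note: LNR is scaled together with the features before it becomes the index
    df = df.set_index('LNR')
    return df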
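
Because `LNR` is an identifier, scaling it makes the index hard to map back to the original rows. A possible variation (a sketch on toy data, not the function used in this notebook) is to move the identifier into the index before standardizing, so that only the features are scaled. The `LNR` values and the `feature` column below are made up.

```python
import pandas as pd

toy = pd.DataFrame({'LNR': [910215, 910220, 910225], 'feature': [1.0, 2.0, 3.0]})

# set the identifier as index first, so it is left out of the scaling
indexed = toy.set_index('LNR')

# plain-pandas standardization (ddof=0 is the population standard deviation)
scaled = (indexed - indexed.mean()) / indexed.std(ddof=0)

print(scaled)
```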

In [5]:
scaled_azdias=preprocess(azdias,'azdias')

Shape before (891221, 366)
{'OST_WEST_KZ', 'D19_LETZTER_KAUF_BRANCHE', 'CAMEO_DEU_2015', 'EINGEFUEGT_AM', 'CAMEO_DEUG_2015', 'CAMEO_INTL_2015'}
Number of nan Values 0
shape after (744305, 405)


In [6]:
scaled_customers=preprocess(customers,'customers')

Shape before (191652, 369)
{'OST_WEST_KZ', 'D19_LETZTER_KAUF_BRANCHE', 'CAMEO_DEU_2015', 'EINGEFUEGT_AM', 'CAMEO_DEUG_2015', 'CAMEO_INTL_2015'}
Number of nan Values 0
shape after (191652, 405)


In [7]:
scaled_azdias.head()

LNR,AGER_TYP,AKT_DAT_KL,ALTER_HH,ALTERSKATEGORIE_FEIN,ANZ_HAUSHALTE_AKTIV,ANZ_HH_TITEL,ANZ_KINDER,ANZ_PERSONEN,ANZ_STATISTISCHE_HAUSHALTE,ANZ_TITEL,...,CAMEO_DEU_2015_8B,CAMEO_DEU_2015_8C,CAMEO_DEU_2015_8D,CAMEO_DEU_2015_9A,CAMEO_DEU_2015_9B,CAMEO_DEU_2015_9C,CAMEO_DEU_2015_9D,CAMEO_DEU_2015_9E,OST_WEST_KZ_O,OST_WEST_KZ_W
1.060104,-0.573138,1.249922,-1.419539,1.560724,0.158256,-0.128594,-0.297023,0.232945,0.291347,-0.060494,...,-0.211823,-0.205615,-0.153395,-0.166375,-0.19337,-0.183414,-0.196412,-0.087037,-0.519892,0.519892
1.060123,-0.573138,1.249922,0.808202,0.67526,0.094457,-0.128594,-0.297023,-0.627017,-0.055617,-0.060494,...,-0.211823,-0.205615,-0.153395,-0.166375,-0.19337,-0.183414,-0.196412,-0.087037,-0.519892,0.519892
1.060127,1.857451,-0.942389,0.284028,-0.210205,-0.479738,-0.128594,-0.297023,-1.486979,-0.402581,-0.060494,...,-0.211823,-0.205615,-0.153395,-0.166375,-0.19337,-0.183414,-0.196412,-0.087037,-0.519892,0.519892
1.060185,-0.573138,-0.942389,1.201333,0.011161,-0.35214,-0.128594,-0.297023,1.952869,-0.333188,-0.060494,...,-0.211823,-0.205615,-0.153395,-0.166375,-0.19337,-0.183414,-0.196412,-0.087037,-0.519892,0.519892
1.060197,2.667648,-0.942389,-0.109103,-0.874304,-0.224541,-0.128594,-0.297023,-0.627017,-0.402581,-0.060494,...,-0.211823,4.863456,-0.153395,-0.166375,-0.19337,-0.183414,-0.196412,-0.087037,-0.519892,0.519892


In [8]:
scaled_customers.head()

LNR,AGER_TYP,AKT_DAT_KL,ALTER_HH,ALTERSKATEGORIE_FEIN,ANZ_HAUSHALTE_AKTIV,ANZ_HH_TITEL,ANZ_KINDER,ANZ_PERSONEN,ANZ_STATISTISCHE_HAUSHALTE,ANZ_TITEL,...,CAMEO_DEU_2015_8B,CAMEO_DEU_2015_8C,CAMEO_DEU_2015_8D,CAMEO_DEU_2015_9A,CAMEO_DEU_2015_9B,CAMEO_DEU_2015_9C,CAMEO_DEU_2015_9D,CAMEO_DEU_2015_9E,OST_WEST_KZ_O,OST_WEST_KZ_W
-1.55807,1.189681,-0.325074,0.192466,-0.068433,-0.235979,-0.105218,-0.238379,-0.166805,-0.222434,-0.116283,...,-0.142037,-0.121786,-0.10791,-0.073256,-0.069338,-0.071949,-0.10157,-0.089055,-0.250816,0.250816
-1.558034,-0.966005,4.271391,0.329163,-0.068433,-0.235979,-0.105218,-0.238379,0.656067,-0.222434,-0.116283,...,-0.142037,-0.121786,-0.10791,-0.073256,-0.069338,-0.071949,-0.10157,-0.089055,-0.250816,0.250816
0.86842,-0.966005,-0.325074,-0.354319,-2.897572,-0.235979,-0.105218,-0.238379,-0.989677,-0.222434,-0.116283,...,-0.142037,-0.121786,-0.10791,-0.073256,-0.069338,-0.071949,-0.10157,-0.089055,-0.250816,0.250816
0.868438,0.471119,-0.325074,-0.080926,-0.634261,-0.316443,-0.105218,-0.238379,-1.812549,-0.222434,-0.116283,...,-0.142037,-0.121786,-0.10791,-0.073256,-0.069338,-0.071949,-0.10157,-0.089055,-0.250816,0.250816
0.868456,-0.966005,-0.325074,1.559431,1.063223,0.246806,-0.105218,-0.238379,1.478939,0.265169,-0.116283,...,-0.142037,-0.121786,-0.10791,-0.073256,-0.069338,-0.071949,-0.10157,-0.089055,-0.250816,0.250816
