# Examples

## Input data

Data can be read from a string using the `ImportString` class of `explann.dataio`. The default delimiter is `\s`, the special character that matches whitespace.

In [2]:
from explann.dataio import ImportString

data_string = """
Observação	Dureza	Temperatura
1	137	220
2	137	220
3	137	220
4	136	220
5	135	220
6	135	225
7	133	225
8	132	225
9	133	225
10	133	225
11	128	230
12	124	230
13	126	230
14	129	230
15	126	230
16	122	235
17	122	235
18	122	235
19	119	235
20	122	235
"""

data_reader_string = ImportString(data=data_string, delimiter=r"\s")  # raw string avoids an invalid-escape warning

The `data_reader_string` object stores the provided data in its `.data` attribute.

In [3]:
data_reader_string.data

Unnamed: 0,Observação,Dureza,Temperatura
0,1,137,220
1,2,137,220
2,3,137,220
3,4,136,220
4,5,135,220
5,6,135,225
6,7,133,225
7,8,132,225
8,9,133,225
9,10,133,225
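
To make the role of the whitespace delimiter concrete, here is a minimal sketch that parses a few rows of the same table with plain pandas. It is an illustration only and does not use the `explann` API; whether `ImportString` delegates to `pd.read_csv` is an assumption, and the names `sample` and `frame` are hypothetical.

```python
import io

import pandas as pd

# A few rows of the table above, with explicit tab separators.
sample = (
    "Observação\tDureza\tTemperatura\n"
    "1\t137\t220\n"
    "6\t135\t225\n"
    "11\t128\t230\n"
)

# r"\s+" is a regular expression matching one or more whitespace characters.
# The raw-string prefix keeps Python from treating "\s" as an escape sequence.
# Assumption: this stands in for the library's parsing step; it is not explann code.
frame = pd.read_csv(io.StringIO(sample), sep=r"\s+")

print(frame)
```

Because the separator is a regular expression, tabs and runs of spaces are handled the same way, which is convenient for data pasted from a spreadsheet.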


Data can also be read from a `.xlsx` (Excel) file using the `ImportXLSX` class of `explann.dataio`. By default the first sheet is read; to read a different one, pass the additional argument `sheet_name`.

In [4]:
from explann.dataio import ImportXLSX

data_reader_xlsx = ImportXLSX(path="../../data/paper_data_24.xlsx")

data_reader_xlsx.data

Unnamed: 0,U,A,P,Y,F,C,B
0,-1,-1,-1,-1,39,1.328,170
1,1,-1,-1,-1,87,1.699,122
2,-1,1,-1,-1,48,1.332,473
3,1,1,-1,-1,71,1.979,511
4,-1,-1,1,-1,43,1.458,156
5,1,-1,1,-1,84,2.189,204
6,-1,1,1,-1,45,1.343,385
7,1,1,1,-1,112,1.707,288
8,-1,-1,-1,1,19,1.257,114
9,1,-1,-1,1,146,2.148,116


Every importer can also parse the levels of a factorial design such as the one above.

In [5]:
levels_string = """
Levels;U;A;P;Y
-1;0.15;0.7; 0.40;0.13
0; 0.30;1.4; 0.75;0.26
1; 0.45;2.1; 1.10;0.38
"""

levels_reader = ImportString(
    data = levels_string, 
    delimiter = ";",
    index_col = 0  # name or position of the column that holds the level codes
)
levels_reader.data

Levels,U,A,P,Y
-1,0.15,0.7,0.4,0.13
0,0.3,1.4,0.75,0.26
1,0.45,2.1,1.1,0.38
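
The effect of `delimiter=";"` together with `index_col=0` can be sketched with plain pandas. This is a hedged illustration rather than the `explann` implementation: the assumption is only that the importer produces a DataFrame indexed by the level codes, as the output above suggests. The `skipinitialspace=True` option is added here to drop the blanks that follow some semicolons.

```python
import io

import pandas as pd

levels_text = """Levels;U;A;P;Y
-1;0.15;0.7; 0.40;0.13
0; 0.30;1.4; 0.75;0.26
1; 0.45;2.1; 1.10;0.38
"""

# index_col=0 turns the first column ("Levels") into the row index, so each
# coded level (-1, 0, 1) can be looked up directly with .loc.
# Assumption: plain pandas stand-in, not the explann API.
levels = pd.read_csv(
    io.StringIO(levels_text),
    sep=";",
    index_col=0,
    skipinitialspace=True,
)

# Real value of factor "A" at the high level (+1)
print(levels.loc[1, "A"])
```

With the level codes as the index, looking up the real value of a factor at a given coded level becomes a single `.loc` call.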


The same data can also be imported from a `.xlsx` file.

In [6]:
levels_reader_xlsx = ImportXLSX(
    path="../../data/paper_data_24.xlsx",
    sheet_name="Levels",
    index_col=0,
)
levels_reader_xlsx.data

Levels,U,A,P,Y
-1,0.15,0.7,0.4,0.13
0,0.3,1.4,0.75,0.26
1,0.45,2.1,1.1,0.38


The data reader's `parse_levels` method accepts a `pd.DataFrame` built by any of the methods above. Alternatively, you can pass a string or a `.xlsx` path directly to the methods `parse_levels_from_string` and `parse_levels_from_xlsx`.

In [7]:
# passing a pd.DataFrame
data_reader_xlsx.parse_levels(
    data = levels_reader.data
)

# passing a string
data_reader_xlsx.parse_levels_from_string(
    data = levels_string, 
    delimiter=";"
)

# passing a path
data_reader_xlsx.parse_levels_from_xlsx(
    data = "../../data/paper_data_24.xlsx", 
    sheet_name = "Levels",
    index_col=0,
)


The results are the same in all three cases: the coded values in the `data` attribute are replaced by the corresponding level values for each variable, as listed in the levels table (`levels_reader.data` or `levels_reader_xlsx.data`).

In [8]:
data_reader_xlsx.data

Unnamed: 0,U,A,P,Y,F,C,B
0,-1,0.7,0.4,0.13,39,1.328,170
1,1,0.7,0.4,0.13,87,1.699,122
2,-1,2.1,0.4,0.13,48,1.332,473
3,1,2.1,0.4,0.13,71,1.979,511
4,-1,0.7,1.1,0.13,43,1.458,156
5,1,0.7,1.1,0.13,84,2.189,204
6,-1,2.1,1.1,0.13,45,1.343,385
7,1,2.1,1.1,0.13,112,1.707,288
8,-1,0.7,0.4,0.38,19,1.257,114
9,1,0.7,0.4,0.38,146,2.148,116
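
The mapping from coded levels to real values can be sketched in a few lines of pandas. The helper `decode_levels` below is hypothetical and is not part of `explann`; it only illustrates the idea using factors `A` and `P` and four of the runs shown above.

```python
import pandas as pd

# Coded design: runs 0, 2, 4 and 6 of the table above, factors A and P only.
design = pd.DataFrame({
    "A": [-1, 1, -1, 1],
    "P": [-1, -1, 1, 1],
    "F": [39, 48, 43, 45],  # response column, not a factor
})

# Levels table indexed by the coded level, as in the levels examples.
levels = pd.DataFrame(
    {"A": [0.7, 1.4, 2.1], "P": [0.40, 0.75, 1.10]},
    index=pd.Index([-1, 0, 1], name="Levels"),
)


def decode_levels(coded: pd.DataFrame, levels: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: swap coded levels for the values in `levels`."""
    decoded = coded.copy()  # work on a copy so the coded frame is preserved
    for factor in levels.columns:
        if factor in decoded.columns:
            # Series.map looks each coded value up in the index of levels[factor]
            decoded[factor] = decoded[factor].map(levels[factor])
    return decoded


decoded = decode_levels(design, levels)
print(decoded)
```

Columns that do not appear in the levels table, such as the response `F`, pass through unchanged, and the coded frame itself is left untouched.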


The original data is retained in the `raw_data` attribute.

In [9]:
data_reader_xlsx.raw_data

Unnamed: 0,U,A,P,Y,F,C,B
0,-1,-1,-1,-1,39,1.328,170
1,1,-1,-1,-1,87,1.699,122
2,-1,1,-1,-1,48,1.332,473
3,1,1,-1,-1,71,1.979,511
4,-1,-1,1,-1,43,1.458,156
5,1,-1,1,-1,84,2.189,204
6,-1,1,1,-1,45,1.343,385
7,1,1,1,-1,112,1.707,288
8,-1,-1,-1,1,19,1.257,114
9,1,-1,-1,1,146,2.148,116
