# DataTypeSystem
### ***Python package***

This Python package provides a type system for different data structures that are 
coercible to full arrays. It is a Python translation of the code of the Raku package
["Data::TypeSystem"](https://github.com/antononcube/Raku-Data-TypeSystem), [AAp1].

------

## Installation

### Install from GitHub

```shell
pip install -e git+https://github.com/antononcube/Python-packages.git#egg=DataTypeSystem-antononcube\&subdirectory=DataTypeSystem
```

### Install from PyPI

```shell
pip install DataTypeSystem
```

------

## Usage examples

The type system conventions follow those of Mathematica's 
[`Dataset`](https://reference.wolfram.com/language/ref/Dataset.html) 
-- see the presentation 
["Dataset improvements"](https://www.wolfram.com/broadcast/video.php?c=488&p=4&disp=list&v=3264).

Here we get the Titanic dataset, keep four of its columns, rename the "pclass" column to "class", 
and show the dataset's dimensions:

```python
import pandas

dfTitanic = pandas.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/titanic.csv')
dfTitanic = dfTitanic[["sex", "age", "pclass", "survived"]]
dfTitanic = dfTitanic.rename(columns={"pclass": "class"})
dfTitanic.shape
```

```
(891, 4)
```

Here is a sample of the dataset's records:

```python
from DataTypeSystem import *

dfTitanic.sample(3)
```

|     | sex  | age  | class | survived |
|-----|------|------|-------|----------|
| 555 | male | 62.0 | 1     | 0        |
| 278 | male | 7.0  | 3     | 0        |
| 266 | male | 16.0 | 3     | 0        |

Here is the type of a single record:

```python
deduce_type(dfTitanic.iloc[12].to_dict())
```

```
Struct([age, class, sex, survived], [float, int, str, int])
```

Here is the type of a single record's values:

```python
deduce_type(dfTitanic.iloc[12].to_dict().values())
```

```
Tuple([Atom(<class 'str'>), Atom(<class 'float'>), Atom(<class 'int'>), Atom(<class 'int'>)])
```
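The two results above differ in shape: a dictionary is described by its keys (listed alphabetically) paired with value types, while a bare sequence of values is described positionally. The following is a minimal sketch of that distinction in plain Python. The function `sketch_type` and the sample record values are hypothetical and for illustration only; this is not the package's `deduce_type` implementation, and it does not handle nested or homogeneous data.

```python
def sketch_type(data):
    """Hypothetical stand-in for illustration; not the package's deduce_type."""
    if isinstance(data, dict):
        # Keyed data: sort the keys and pair each with its value's type name.
        keys = sorted(data)
        types = [type(data[k]).__name__ for k in keys]
        return f"Struct([{', '.join(keys)}], [{', '.join(types)}])"
    # Any other iterable: describe each element by position.
    atoms = [f"Atom({type(v)})" for v in data]
    return f"Tuple([{', '.join(atoms)}])"


# Illustrative record with the same columns as the Titanic data above
# (the values are made up, not taken from the dataset).
record = {"sex": "male", "age": 2.0, "class": 3, "survived": 0}

print(sketch_type(record))
print(sketch_type(record.values()))
```

The first call prints a `Struct` with alphabetically ordered keys; the second prints a `Tuple` whose entries follow the dictionary's insertion order.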

Here is the type of the whole dataset:

```python
deduce_type(dfTitanic.to_dict())
```

```
Assoc(Atom(<class 'str'>), Assoc(Atom(<class 'int'>), Atom(<class 'str'>), 891), 4)
```

Here is the type of "values only" records:

```python
valArr = dfTitanic.transpose().to_dict().values()
deduce_type(valArr)
```

```
Vector(Struct([age, class, sex, survived], [float, int, str, int]), 891)
```
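The `Vector(..., 891)` result suggests that a collection whose records all share one type is summarized by that type plus a count. A minimal sketch of this idea in plain Python follows. The helpers `sketch_record_type` and `sketch_collection_type` are hypothetical, the sample records are made up, and the `Tuple` fallback for mixed record types is an assumption rather than documented package behavior.

```python
def sketch_record_type(record):
    """Hypothetical helper: Struct-like description of one dictionary record."""
    keys = sorted(record)
    types = [type(record[k]).__name__ for k in keys]
    return f"Struct([{', '.join(keys)}], [{', '.join(types)}])"


def sketch_collection_type(records):
    """Hypothetical helper: collapse same-typed records into Vector(type, count)."""
    record_types = [sketch_record_type(r) for r in records]
    if len(set(record_types)) == 1:
        # Every record has the same type: report it once, with the length.
        return f"Vector({record_types[0]}, {len(record_types)})"
    # Assumed fallback for mixed record types: list each type by position.
    return f"Tuple([{', '.join(record_types)}])"


# Made-up records with the same columns as the Titanic data above.
records = [
    {"sex": "male", "age": 62.0, "class": 1, "survived": 0},
    {"sex": "male", "age": 7.0, "class": 3, "survived": 0},
    {"sex": "female", "age": 16.0, "class": 3, "survived": 1},
]

print(sketch_collection_type(records))
```

Comparing rendered type strings keeps the sketch short; a fuller implementation would compare structured type objects instead.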

-------

## References

[AAp1] Anton Antonov,
[Data::TypeSystem Raku package](https://github.com/antononcube/Raku-Data-TypeSystem),
(2023),
[GitHub/antononcube](https://github.com/antononcube/).