This notebook describes the **Lemuras** data types and shows how to write **Table** objects to, and read them from, the **CSV, SQL, JSON** and **HTML** formats.

### Sample data

In [1]:
from lemuras import Table
from datetime import datetime, timedelta

def mkdt(ds):
    return datetime.now() - timedelta(days=ds)

cols = ['type', 'size', 'weight', 'when', 'tel']
rows = [
  ['A', 1, 12, mkdt(3), '+79360360193'],
  ['B', 4, 12, mkdt(33), 84505505151],
  ['A', 3, 10, mkdt(48), '+31415926535'],
  ['B', 6, 14, mkdt(333), 0],
  ['A', 4, 15, mkdt(209), None],
  ['A', 2, 11, mkdt(192), ''],
]
df1 = Table(cols, rows, 'Sample')
df1

'type','size','weight','when','tel'
'A',1,12,2018-02-27 03:09:14.492055,'+79360360193'
'B',4,12,2018-01-28 03:09:14.492080,84505505151
'A',3,10,2018-01-13 03:09:14.492085,'+31415926535'
'B',6,14,2017-04-03 03:09:14.492088,0
'A',4,15,2017-08-05 03:09:14.492092,
'A',2,11,2017-08-22 03:09:14.492096,''


# Data types

A Lemuras Table object is built from native Python lists, so it can contain any objects that support **str** and **repr**. However, there is advanced built-in support for the following types:

- **int** – identifier **`i`**.
- **float** – identifier **`f`**.
- **str** – identifier **`s`**.
- **date** – identifier **`d`**.
- **datetime** – identifier **`t`**.

You may also encounter the identifier **`m`**, which means that a column contains multiple (mixed) types.

The **`.get_type()`** method of a **Column** object returns a tuple with the type identifier and the maximum length in symbols needed to represent a value:

In [2]:
df1['weight'].get_type()

('i', 2)

The **`.find_types()`** method of a **Table** object returns a new **Table** with the type of each column:

In [3]:
df1.find_types()

'Column','Type','Symbols'
'type','s',1
'size','i',1
'weight','i',2
'when','t',26
'tel','m',12
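To make the identifiers more concrete, here is a minimal pure-Python sketch of how a type identifier and a symbol length could be derived for a list of values. The names `TYPE_IDS` and `guess_type` are hypothetical and exist only for this illustration; this is not the Lemuras implementation, and its actual rules may differ.

```python
from datetime import date, datetime

# Hypothetical mapping that mirrors the identifiers listed above.
TYPE_IDS = {int: 'i', float: 'f', str: 's', date: 'd', datetime: 't'}

def guess_type(values):
    """Return (identifier, max string length) for a list of values."""
    # Exact type lookup: anything unknown (e.g. None) counts as mixed.
    kinds = {TYPE_IDS.get(type(v), 'm') for v in values}
    ident = kinds.pop() if len(kinds) == 1 else 'm'
    # Length of the longest str() representation in the column.
    width = max((len(str(v)) for v in values), default=0)
    return ident, width

print(guess_type([12, 12, 10, 14, 15, 11]))          # ('i', 2)
print(guess_type(['+79360360193', 84505505151, 0]))  # ('m', 12)
```

A column that holds both strings and integers falls back to `m`, which is the situation of the `tel` column in the sample data.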


# Save CSV

Use the **`.to_csv`** method to save the data as comma-separated values. If you pass an argument, it is used as the name of the file to save the result to. Without an argument, the method returns the CSV text:

In [4]:
txt = df1.to_csv()
txt

'type,size,weight,when,tel\nA,1,12,2018-02-27 03:09:14.492055,+79360360193\nB,4,12,2018-01-28 03:09:14.492080,84505505151\nA,3,10,2018-01-13 03:09:14.492085,+31415926535\nB,6,14,2017-04-03 03:09:14.492088,0\nA,4,15,2017-08-05 03:09:14.492092,None\nA,2,11,2017-08-22 03:09:14.492096,'
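For comparison, the same idea (return the text, and optionally write it to a file) can be sketched with the standard `csv` module. The helper `rows_to_csv` and its `filename` parameter are hypothetical names chosen for this sketch and are not part of Lemuras.

```python
import csv
import io

def rows_to_csv(columns, rows, filename=None):
    """Serialize rows to CSV text; also write it to `filename` if given."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator='\n')
    writer.writerow(columns)
    # str() is applied explicitly so that None is written as "None",
    # similar to the output above (csv.writer alone would write "").
    writer.writerows([str(v) for v in row] for row in rows)
    text = buf.getvalue()
    if filename is not None:
        with open(filename, 'w', newline='') as f:
            f.write(text)
    return text

text = rows_to_csv(['type', 'size', 'tel'],
                   [['A', 1, '+79360360193'], ['A', 4, None]])
```

Unlike the Lemuras output above, this sketch ends the text with a trailing newline.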

# Load CSV

Use the **`.from_csv`** class method to create a Table object from CSV data.

By default, the first argument is treated as the name of the file to load. You can change this behaviour with the **`inline=True`** argument, which makes the method treat the first argument as the CSV text itself:

In [5]:
df2 = Table.from_csv(txt, inline=True)
df2

'type','size','weight','when','tel'
'A',1,12,2018-02-27 03:09:14,79360360193.0
'B',4,12,2018-01-28 03:09:14,84505505151.0
'A',3,10,2018-01-13 03:09:14,31415926535.0
'B',6,14,2017-04-03 03:09:14,0.0
'A',4,15,2017-08-05 03:09:14,
'A',2,11,2017-08-22 03:09:14,


The detected types are the same as before serialization, although the symbol lengths of **when** and **tel** have changed:

In [6]:
df2.find_types()

'Column','Type','Symbols'
'type','s',1
'size','i',1
'weight','i',2
'when','t',19
'tel','m',11


You can also pass **`empty`** to specify a value that replaces **None** values:

In [7]:
df2 = Table.from_csv(txt, inline=True, empty=0)
df2

'type','size','weight','when','tel'
'A',1,12,2018-02-27 03:09:14,79360360193
'B',4,12,2018-01-28 03:09:14,84505505151
'A',3,10,2018-01-13 03:09:14,31415926535
'B',6,14,2017-04-03 03:09:14,0
'A',4,15,2017-08-05 03:09:14,0
'A',2,11,2017-08-22 03:09:14,0


Alternatively, pass **`preprocess=False`** to disable preprocessing and leave all values as strings:

In [8]:
df2 = Table.from_csv(txt, inline=True, preprocess=False)
df2

'type','size','weight','when','tel'
'A','1','12','2018-02-27 03:09:14.492055','+79360360193'
'B','4','12','2018-01-28 03:09:14.492080','84505505151'
'A','3','10','2018-01-13 03:09:14.492085','+31415926535'
'B','6','14','2017-04-03 03:09:14.492088','0'
'A','4','15','2017-08-05 03:09:14.492092','None'
'A','2','11','2017-08-22 03:09:14.492096',''
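The effect of preprocessing and of a replacement for empty values can be illustrated with a small standalone sketch. The function `parse_cell` is hypothetical, and treating both `''` and `'None'` as empty is an assumption based on the outputs above; the conversion rules Lemuras actually applies may differ (for example, it turned the phone numbers into floats).

```python
from datetime import datetime

def parse_cell(text, empty=None):
    """Convert a CSV string cell to int, float or datetime when possible."""
    # Assumption: both '' and 'None' stand for a missing value.
    if text in ('', 'None'):
        return empty
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    try:
        return datetime.strptime(text, '%Y-%m-%d %H:%M:%S')
    except ValueError:
        return text  # leave anything unrecognized as a string

cells = ['A', '12', '1.5', '2018-02-27 03:09:14', '']
parsed = [parse_cell(c, empty=0) for c in cells]
```

Skipping the call to `parse_cell` altogether corresponds to leaving every value as a string.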


# Save SQL

Lemuras can also work with SQL. You can generate the SQL code that creates the table; it relies on the automatic detection of column types described earlier.

In [9]:
sql_cr = df1.to_sql_create()
print(sql_cr)

CREATE TABLE `Sample` (
  `type` varchar(1),
  `size` int(1),
  `weight` int(1),
  `when` datetime,
  `tel` varchar(12)
) ;


You can also generate the code that fills the table with data:

In [10]:
sql_vals = df1.to_sql_values()
print(sql_vals)

INSERT INTO `Sample` VALUES ('A',1,12,'2018-02-27 03:09:14.492055','+79360360193'), ('B',4,12,'2018-01-28 03:09:14.492080','84505505151'), ('A',3,10,'2018-01-13 03:09:14.492085','+31415926535'), ('B',6,14,'2017-04-03 03:09:14.492088','0'), ('A',4,15,'2017-08-05 03:09:14.492092','None'), ('A',2,11,'2017-08-22 03:09:14.492096','');
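Statements of this shape can be passed to a database driver. The sketch below runs a shortened, hand-written variant of the two statements against an in-memory SQLite database using the standard `sqlite3` module. It assumes SQLite, which accepts MySQL-style backtick quoting for compatibility; other database engines may require different quoting or type names.

```python
import sqlite3

# Shortened, hand-written variants of the generated statements above.
create_sql = """CREATE TABLE `Sample` (
  `type` varchar(1),
  `size` int(1),
  `when` datetime,
  `tel` varchar(12)
) ;"""
insert_sql = ("INSERT INTO `Sample` VALUES "
              "('A',1,'2018-02-27 03:09:14.492055','+79360360193'), "
              "('B',4,'2018-01-28 03:09:14.492080','84505505151');")

conn = sqlite3.connect(':memory:')  # throwaway in-memory database
conn.execute(create_sql)
conn.execute(insert_sql)
rows = conn.execute("SELECT * FROM `Sample`").fetchall()
conn.close()
```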


# Load SQL

First, load the table declaration to retrieve the structure:

In [11]:
df2 = Table.from_sql_create(sql_cr)
df2

'type','size','weight','when','tel'


Then, supply the data:

In [12]:
df2.add_sql_values(sql_vals)
df2

'type','size','weight','when','tel'
'A',1,12,2018-02-27 03:09:14,79360360193.0
'B',4,12,2018-01-28 03:09:14,84505505151.0
'A',3,10,2018-01-13 03:09:14,31415926535.0
'B',6,14,2017-04-03 03:09:14,0.0
'A',4,15,2017-08-05 03:09:14,
'A',2,11,2017-08-22 03:09:14,


# Save JSON

You can save a Table object to a JSON string with rows as lists (the default). Set **`pretty`** to get more readable text:

In [13]:
s = df1.to_json(pretty=True)
print(s)

{
  "columns": [
    "type", "size", "weight", "when", "tel"
  ], 
  "rows": [
    [
      "A", 1, 12, "2018-02-27 03:09:14.492055", "+79360360193"
    ], [
      "B", 4, 12, "2018-01-28 03:09:14.492080", 84505505151
    ], [
      "A", 3, 10, "2018-01-13 03:09:14.492085", "+31415926535"
    ], [
      "B", 6, 14, "2017-04-03 03:09:14.492088", 0
    ], [
      "A", 4, 15, "2017-08-05 03:09:14.492092", "None"
    ], [
      "A", 2, 11, "2017-08-22 03:09:14.492096", ""
    ]
  ], 
  "title": "Sample"
}


Or you can save rows as objects (though it is much less compact):

In [14]:
s = df1.to_json(as_dict=True, pretty=True)
print(s)

{
  "columns": [
    "type", "size", "weight", "when", "tel"
  ], 
  "rows": [
    {
      "type": "A", "size": 1, "weight": 12, "when": "2018-02-27 03:09:14.492055", "tel": "+79360360193"
    }, {
      "type": "B", "size": 4, "weight": 12, "when": "2018-01-28 03:09:14.492080", "tel": 84505505151
    }, {
      "type": "A", "size": 3, "weight": 10, "when": "2018-01-13 03:09:14.492085", "tel": "+31415926535"
    }, {
      "type": "B", "size": 6, "weight": 14, "when": "2017-04-03 03:09:14.492088", "tel": 0
    }, {
      "type": "A", "size": 4, "weight": 15, "when": "2017-08-05 03:09:14.492092", "tel": "None"
    }, {
      "type": "A", "size": 2, "weight": 11, "when": "2017-08-22 03:09:14.492096", "tel": ""
    }
  ], 
  "title": "Sample"
}
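The difference in size between the two layouts comes from repeating the column names in every row. A small sketch with the standard `json` module shows this; the dictionaries here are built by hand to imitate the structure of the output above and do not come from Lemuras itself.

```python
import json

columns = ['type', 'size']
rows = [['A', 1], ['B', 4]]

# Rows as lists: column names are stored once.
as_lists = {'columns': columns, 'rows': rows, 'title': 'Sample'}
# Rows as objects: column names are repeated in every row.
as_dicts = {'columns': columns,
            'rows': [dict(zip(columns, row)) for row in rows],
            'title': 'Sample'}

compact = json.dumps(as_lists)
verbose = json.dumps(as_dicts)
# indent makes the text easier to read, at the cost of extra whitespace.
pretty = json.dumps(as_dicts, indent=2)
```

The gap between `len(compact)` and `len(verbose)` grows with the number of rows, since each extra row repeats every column name.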


# Load JSON

You can load a JSON string in either of the two formats shown above (**title** is optional):

In [15]:
df2 = Table.from_json(s)
df2

'type','size','weight','when','tel'
'A',1,12,'2018-02-27 03:09:14.492055','+79360360193'
'B',4,12,'2018-01-28 03:09:14.492080',84505505151
'A',3,10,'2018-01-13 03:09:14.492085','+31415926535'
'B',6,14,'2017-04-03 03:09:14.492088',0
'A',4,15,'2017-08-05 03:09:14.492092','None'
'A',2,11,'2017-08-22 03:09:14.492096',''
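Accepting both layouts with an optional title can be sketched as follows. `load_table_json` is a hypothetical helper written for this illustration and is not the Lemuras loader. As in the output above, date strings are left as plain strings.

```python
import json

def load_table_json(text):
    """Return (columns, rows, title) from either JSON layout."""
    data = json.loads(text)
    columns = data['columns']
    rows = [
        # Object rows are reordered by column name; list rows are copied.
        [row.get(col) for col in columns] if isinstance(row, dict) else list(row)
        for row in data['rows']
    ]
    # The title is optional, so a missing key yields None.
    return columns, rows, data.get('title')

list_layout = '{"columns": ["type", "size"], "rows": [["A", 1], ["B", 4]]}'
dict_layout = ('{"columns": ["type", "size"], "title": "Sample", '
               '"rows": [{"type": "A", "size": 1}, {"size": 4, "type": "B"}]}')
```

Both strings produce the same columns and rows; only the second one carries a title.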


# Save HTML

To save the data as an HTML table, use the **`.html()`** instance method. By default, rows and columns are cut; set the optional *`cut`* parameter to `False` to turn this off:

In [16]:
df1.html(cut=False)

"<table><tr><th>'type'</th><th>'size'</th><th>'weight'</th><th>'when'</th><th>'tel'</th></tr><tr><td>'A'</td><td>1</td><td>12</td><td>2018-02-27 03:09:14.492055</td><td>'+79360360193'</td></tr><tr><td>'B'</td><td>4</td><td>12</td><td>2018-01-28 03:09:14.492080</td><td>84505505151</td></tr><tr><td>'A'</td><td>3</td><td>10</td><td>2018-01-13 03:09:14.492085</td><td>'+31415926535'</td></tr><tr><td>'B'</td><td>6</td><td>14</td><td>2017-04-03 03:09:14.492088</td><td>0</td></tr><tr><td>'A'</td><td>4</td><td>15</td><td>2017-08-05 03:09:14.492092</td><td>None</td></tr><tr><td>'A'</td><td>2</td><td>11</td><td>2017-08-22 03:09:14.492096</td><td>''</td></tr></table>"

By the way, the output of Table objects in these Jupyter Notebooks is implemented using this method.

In [17]:
df1

'type','size','weight','when','tel'
'A',1,12,2018-02-27 03:09:14.492055,'+79360360193'
'B',4,12,2018-01-28 03:09:14.492080,84505505151
'A',3,10,2018-01-13 03:09:14.492085,'+31415926535'
'B',6,14,2017-04-03 03:09:14.492088,0
'A',4,15,2017-08-05 03:09:14.492092,
'A',2,11,2017-08-22 03:09:14.492096,''
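Jupyter renders an object as HTML when its class defines a `_repr_html_` method. The sketch below shows that mechanism on a tiny stand-in class. `TinyTable` is hypothetical, and whether Lemuras connects its `.html()` method to the notebook in exactly this way is an assumption.

```python
from html import escape

class TinyTable:
    """Hypothetical stand-in for a table class, for illustration only."""

    def __init__(self, columns, rows):
        self.columns = columns
        self.rows = rows

    def html(self):
        # Escape every cell so that characters like < and & stay literal.
        head = ''.join('<th>{}</th>'.format(escape(str(c))) for c in self.columns)
        body = ''.join(
            '<tr>{}</tr>'.format(
                ''.join('<td>{}</td>'.format(escape(str(v))) for v in row))
            for row in self.rows
        )
        return '<table><tr>{}</tr>{}</table>'.format(head, body)

    # Jupyter calls _repr_html_ (if present) to display the object.
    def _repr_html_(self):
        return self.html()

t = TinyTable(['a', 'b'], [[1, '<x>']])
```

Evaluating `t` as the last expression of a notebook cell would then show a rendered table instead of the default text representation.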
