# Package overview tour

This tutorial describes what the package can do and how to use it. It does not cover everything, but it shows the most interesting features.

Let's import the core I/O interface class, eXtended-I/O:

In [1]:
from brane import ExtendedIO as xio

## Read / Write operations via eXtended-I/O

### text

First of all, let's write the following text to a plain text file:

In [2]:
text = """
name,role,birthyear
Alice,sender,1978
Bob,receiver,1978
Carol,,1984
Eve,eavesdropper,1988
Mallory,attacker,2003
Walter,warden,
Ivan,issuer,2002
""".strip()

It's very simple. Specify the string and the path.

In [3]:
xio.write(text, "./text.txt")

You can see that it is written as ordinary text.

In [4]:
!cat ./text.txt

name,role,birthyear
Alice,sender,1978
Bob,receiver,1978
Carol,,1984
Eve,eavesdropper,1988
Mallory,attacker,2003
Walter,warden,
Ivan,issuer,2002

In [5]:
text_reload = xio.read("./text.txt")
assert type(text_reload) == str
print(text_reload)

name,role,birthyear
Alice,sender,1978
Bob,receiver,1978
Carol,,1984
Eve,eavesdropper,1988
Mallory,attacker,2003
Walter,warden,
Ivan,issuer,2002


In [6]:
text == text_reload

True

The reloaded object is a string and is identical to the original one.
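For comparison, the same round trip can be sketched with the standard library alone. This only illustrates the behaviour observed above and is not brane's implementation; the sample string and the temporary directory are chosen for the example.

```python
from pathlib import Path
import tempfile

# Plain-Python sketch of a text round trip (not brane's implementation).
sample = "name,role\nAlice,sender\nBob,receiver"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "text.txt"
    path.write_text(sample, encoding="utf-8")    # write the string as-is
    reloaded = path.read_text(encoding="utf-8")  # read it back as a str

print(type(reloaded) is str, reloaded == sample)
```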

Next, save this text as a CSV file. The extension merely indicates the format; it means nothing to the filesystem.

In [7]:
xio.write(text, "./actor.csv", module_name="TextIO")   # [Note]: replaced by 'textio' in the future

### csv

Now try reading the CSV file. It is just text with a `.csv` extension in the filename.

In [8]:
df = xio.read("actor.csv")
print(type(df))
df

<class 'pandas.core.frame.DataFrame'>


Unnamed: 0,name,role,birthyear
0,Alice,sender,1978.0
1,Bob,receiver,1978.0
2,Carol,,1984.0
3,Eve,eavesdropper,1988.0
4,Mallory,attacker,2003.0
5,Walter,warden,
6,Ivan,issuer,2002.0


This time, the loaded object is not a string but a pandas DataFrame.
If you would like to use the built-in csv package for some reason, you can do so by passing `module_name="Csv"`.

In [9]:
df = xio.read("actor.csv", module_name="Csv")  # [Note]: replaced by 'csv' in the future
print(type(df))
df

<class 'list'>


[['name', 'role', 'birthyear'],
 ['Alice', 'sender', '1978'],
 ['Bob', 'receiver', '1978'],
 ['Carol', '', '1984'],
 ['Eve', 'eavesdropper', '1988'],
 ['Mallory', 'attacker', '2003'],
 ['Walter', 'warden', ''],
 ['Ivan', 'issuer', '2002']]
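The list-of-lists shape above matches what Python's built-in `csv.reader` produces. Whether brane's `Csv` module calls `csv.reader` internally is an assumption; the sketch below only shows the standard-library behaviour on a shortened sample.

```python
import csv
import io

# A shortened sample with missing values in the middle and at the end of a row.
raw = "name,role,birthyear\nCarol,,1984\nWalter,warden,\n"

# csv.reader yields one list of strings per line; nothing is type-converted.
rows = list(csv.reader(io.StringIO(raw)))
print(rows)
```

Every field stays a string and a missing value becomes an empty string, whereas pandas turns it into `NaN` and promotes the column to float.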

Again, you can specify the pandas module via module_name to be explicit. It is better to add a type annotation if you know the type of the loaded object.

In [10]:
import pandas as pd
df: pd.DataFrame = xio.read("actor.csv", module_name="Pandas")  # [Note]: replaced by 'pandas' in the future
print(type(df))
df

<class 'pandas.core.frame.DataFrame'>


Unnamed: 0,name,role,birthyear
0,Alice,sender,1978.0
1,Bob,receiver,1978.0
2,Carol,,1984.0
3,Eve,eavesdropper,1988.0
4,Mallory,attacker,2003.0
5,Walter,warden,
6,Ivan,issuer,2002.0


Let's save the pandas DataFrame. With plain pandas, `to_csv` writes the index by default, so you might expect the index to be saved here too:

In [11]:
xio.write(df, "actor_w_index.csv")
!cat ./actor_w_index.csv

name,role,birthyear
Alice,sender,1978.0
Bob,receiver,1978.0
Carol,,1984.0
Eve,eavesdropper,1988.0
Mallory,attacker,2003.0
Walter,warden,
Ivan,issuer,2002.0


Wow, no index appears. This is because some of the most common parameters are preset.
For pandas, `index=None` is one such preset.
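One way to picture preset parameters is a dictionary of defaults that the caller's keyword arguments override. The names `PRESET_WRITE_KWARGS` and `merge_write_kwargs` below are hypothetical and not part of brane's API; this is a sketch of the idea only.

```python
# Hypothetical illustration of preset defaults; these names are not brane's API.
PRESET_WRITE_KWARGS = {"index": None}

def merge_write_kwargs(**user_kwargs):
    """Return the presets, overridden by whatever the caller passes."""
    return {**PRESET_WRITE_KWARGS, **user_kwargs}

no_override = merge_write_kwargs()                       # presets only
with_override = merge_write_kwargs(index=True, sep=";")  # caller wins
print(no_override, with_override)
```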

Of course, you can specify the index too.

In [12]:
xio.write(obj=df, path="actor_w_name_index.csv", index=True)
!cat ./actor_w_name_index.csv

,name,role,birthyear
0,Alice,sender,1978.0
1,Bob,receiver,1978.0
2,Carol,,1984.0
3,Eve,eavesdropper,1988.0
4,Mallory,attacker,2003.0
5,Walter,warden,
6,Ivan,issuer,2002.0


Here, for clarity, we pass the arguments as keywords (`obj` and `path`).

## Customization

* Consider a new extension `.hello` and a corresponding module `Hello`.
* It handles Python dictionaries.

from brane import Module, Format, Object

### define own module

For that purpose, let's define our own format 'hello' with its class implementation as follows.

* header (1st line): a symbol that serves as the separator in the following lines.
* body (every subsequent line): each line consists of a key and a value joined by the separator specified in the header.
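To make the layout concrete, here is an in-memory sketch that parses a hello-formatted string. The helper `parse_hello` and the `=` separator are chosen for illustration only; skipping empty lines is an extra assumption that tolerates a trailing newline.

```python
def parse_hello(content: str) -> dict:
    """Parse hello-formatted text: the first line is the separator,
    every following line is a key/value pair joined by it."""
    sep, *lines = content.split("\n")
    result = {}
    for line in lines:
        if not line:  # tolerate a trailing newline
            continue
        key, value = line.split(sep, 1)  # split on the first separator only
        result[key] = value
    return result

example = "=\nJan=1\nFeb=2"
parsed = parse_hello(example)
print(parsed)
```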

In [13]:
#from brane.typing import *
from __future__ import annotations
from typing import Union

class HelloClass():
    def __init__(self, mapper: dict[str, Union[str, int]]):
        self.mapper = mapper

    def __repr__(self):
        return self.mapper.__repr__()

class HelloIO():
    @staticmethod
    def load(path: str):
        with open(path, "r") as f:
            file: str = f.read()
        sep, *lines = file.split("\n")
        data: dict = {}
        for line in lines:
            k, v = line.split(sep)
            data[k] = v
        return HelloClass(mapper=data)

    @staticmethod
    def dump(mapper: HelloClass, path: str, sep: str = ": "):
        with open(path, "w") as f:
            f.write(f"{sep}")
            for key, value in mapper.mapper.items():  # use the argument, not a global `obj`
                f.write(f"\n{key}{sep}{value}")

Now, test it:

In [14]:
obj = HelloClass(mapper={
    "Jan": 1,
    "Feb": 2,
    "Mar": 3,
})

Save it.

In [15]:
HelloIO.dump(mapper=obj, path="./test.hello")

In [16]:
!cat ./test.hello

: 
Jan: 1
Feb: 2
Mar: 3

Load it.

In [17]:
HelloIO.load(path="./test.hello")

{'Jan': '1', 'Feb': '2', 'Mar': '3'}

Good.
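Note that the values come back as strings (`'1'`) although integers were saved, because the format stores plain text. If the original types matter, one possible post-processing step is sketched below; `restore_ints` is a hypothetical helper and not part of the tutorial's classes.

```python
def restore_ints(mapper: dict) -> dict:
    """Convert values that look like integers back to int; leave others untouched."""
    restored = {}
    for key, value in mapper.items():
        try:
            restored[key] = int(value)
        except (TypeError, ValueError):
            restored[key] = value
    return restored

loaded = {"Jan": "1", "Feb": "2", "note": "n/a"}
restored = restore_ints(loaded)
print(restored)
```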

### define new Module subclass

In [18]:
from brane.core import Module

class HelloModule(Module):
    name = "hello"  # ID of this Module class
    loaded = True  # If you define the read/write classmethods directly in the Module subclass, you must set this.

    @classmethod
    def read(cls, path: str, *args, **kwargs):
        return HelloIO.load(path=path)

    @classmethod
    def write(cls, obj, path, *args, **kwargs):
        return HelloIO.dump(mapper=obj, path=path)

You must define three attributes in this class:

* name (property): This is the ID of the new module.
* read (classmethod): This defines the reading/loading process. It must accept at least the keyword argument `path`.
* write (classmethod): This defines the writing/saving process. It must accept at least the keyword arguments `obj` and `path`.

Test it again.

In [19]:
HelloModule.read(path="./test.hello")

{'Jan': '1', 'Feb': '2', 'Mar': '3'}

In [20]:
HelloModule.write(obj=obj, path="./test2.hello")

In [21]:
!cat ./test2.hello

: 
Jan: 1
Feb: 2
Mar: 3

No problem at all.

### define new Format subclass

Next, we must implement a subclass of `Format`, which connects the above reading/writing module to the extension.

In [22]:
from brane.core import Format

class HelloFormat(Format):
    name = "hello"  # ID of this Format class
    module = HelloModule
    # module_name = HelloModule.name  # equivalent to "hello" in our case
    default_extension = "hello"  # the extension in the path

* name (property): This is the ID of the new format.
* module (property): The corresponding `Module` subclass.
* default_extension (property): The extension name.

In [23]:
xio.read("./test.hello")

{'Jan': '1', 'Feb': '2', 'Mar': '3'}

OK, great work! Now brane's I/O chooses the correct module based on the extension.
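Extension-based selection can be pictured as a lookup from the path's suffix to a reader. The registry `READERS` and the function `pick_reader` below are hypothetical and the readers are stubs; brane's internals may work differently.

```python
import os

# Hypothetical registry mapping an extension to a reader (stubs for illustration).
READERS = {
    "txt": lambda path: "text reader",
    "hello": lambda path: "hello reader",
}

def pick_reader(path: str):
    """Return the reader registered for the path's extension."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    try:
        return READERS[ext]
    except KeyError:
        raise ValueError(f"no reader registered for extension {ext!r}") from None

reader = pick_reader("./test.hello")
print(reader("./test.hello"))
```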

### define new Object subclass

Finally, let's save our Hello object in our Hello format i.e. with the `.hello` extension.

In [24]:
from brane.core import Object

class HelloObject(Object):
    format = HelloFormat
    module = HelloModule
    object_type = HelloClass

In our case, it's still simple:

* module (property): The corresponding `Module` subclass.
* format (property): The corresponding `Format` subclass.
* object_type (property): The type of the target objects, here, `HelloClass`.
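The role of `object_type` can be pictured as a lookup from the type of the object being written to a format. The sketch below is a hypothetical illustration with stand-in names (`HelloLike`, `FORMAT_BY_TYPE`, `pick_format`) and does not reflect how brane resolves objects internally.

```python
class HelloLike:
    """Stand-in for a user-defined class such as HelloClass."""

# Hypothetical registry from object type to format name.
FORMAT_BY_TYPE = {str: "text", HelloLike: "hello"}

def pick_format(obj) -> str:
    """Return the format name registered for the object's type."""
    for registered_type, format_name in FORMAT_BY_TYPE.items():
        if isinstance(obj, registered_type):
            return format_name
    raise TypeError(f"no format registered for {type(obj).__name__}")

chosen = pick_format(HelloLike())
print(chosen)
```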

In [25]:
xio.write(obj=obj, path="./auto.hello")

In [26]:
!cat ./auto.hello

: 
Jan: 1
Feb: 2
Mar: 3

Now we've learned the basics of defining and registering our own I/O with eXtended-I/O.

## Hook system

coming soon...