# Chapter 4 : Datasources

Dealing with datasources is the most complex part because datasources are so diverse :
- Different content formats : plain text, JSON, ...
- Different protocols : UDP, FTP, HTTP, ...
- Different modes : requesting the datasource (client mode) or being requested by it (server mode).

## Dive into the Source class

The Source class aims at providing a common interface for all datasources.<p>
As datasources have very little in common, the only assumption made by the framework is :<br>

**Every datasource is iterable**<br>

In practical terms, a Source uses a Python generator.<br>
The *pyngsi.source* package offers many generic sources, and it's easy to create your custom Source by extending the Source class.<p>

A Source iterates over rows.<br>
Rows are composed of two parts :
- the record : the incoming content itself (the payload)
- the content provider : a string that identifies the origin of the row

These two points - the iterable sources and the row definition - are the foundation of the framework.<br>
This common interface will allow us to create agents that will use our Sources, as we have seen in the previous chapter.
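
As a rough illustration of these two ideas, independent of pyngsi itself, the sketch below uses two hypothetical stand-ins, `SketchRow` and `SketchSource`. They are not the framework's classes; they only show how a generator-based `__iter__` can deliver rows made of a provider and a record.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator


@dataclass
class SketchRow:
    """Hypothetical stand-in for a row: where it came from, and the payload."""
    provider: str
    record: Any


class SketchSource:
    """Hypothetical stand-in for a source: anything that can be iterated over."""

    def __init__(self, provider: str, records: Iterable[Any]):
        self.provider = provider
        self.records = records

    def __iter__(self) -> Iterator[SketchRow]:
        # A generator method: rows are produced lazily, one per record.
        for record in self.records:
            yield SketchRow(self.provider, record)


rows = list(SketchSource("demo", ["Room1;23;720", "Room2;21;711"]))
```

Because the source is just an iterable, any consumer can loop over it with a plain `for` statement without knowing how the records were obtained.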

### The Row

In [None]:
from pyngsi.source import Row

help(Row.__init__)

### Sources provided by the framework

In [None]:
import pyngsi.source

print([x for x in dir(pyngsi.source) if x.startswith("Source")])

- Source is the Source base class

- SourceSampleOrion is the Source dedicated to the tutorial

- SourceStdin takes incoming data from standard input

- SourceFile takes incoming data from a local file (supports zip & gzip compression)

- SourceJson takes incoming JSON data, from stdin or from a file (supports zip & gzip compression).<br>
If the incoming JSON is a JSON Array, then SourceJson iterates over the elements of the array

- SourceIter takes incoming data from any Python Sequence argument (list, tuple, ...) provided to the constructor

- SourceSingle takes incoming data from the argument provided to the constructor
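
The array behaviour described for SourceJson can be pictured with the standard `json` module alone. This is only a sketch, not SourceJson's actual code: the helper name `iter_json_records` is hypothetical, and treating a non-array document as a single record is an assumption.

```python
import json
from typing import Any, Iterator


def iter_json_records(text: str) -> Iterator[Any]:
    """Yield one record per array element, or the whole document otherwise."""
    data = json.loads(text)
    if isinstance(data, list):
        # JSON Array: iterate over its elements
        yield from data
    else:
        # assumption: any other JSON document is delivered as a single record
        yield data


array_records = list(iter_json_records('[{"color": "red"}, {"color": "green"}]'))
object_records = list(iter_json_records('{"color": "blue"}'))
```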


## Basic Example

Here the Source takes incoming data from a compressed JSON file.

As the JSON is an array, the Source iterates over the elements of the JSON Array.<br>
The provider is filled with the name of the file.

In [None]:
from pyngsi.source import Source

src = Source.create_source_from_file("files/colors.json.gz")

for row in src:
    print(row)
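
To get a feel for what reading such a file involves, here is a minimal standard-library sketch. It is not how pyngsi implements it: the helper `iter_gzipped_json` is hypothetical, the sample data is made up, and using the bare file name as the provider is an assumption.

```python
import gzip
import json
import os
import tempfile
from typing import Any, Iterator, Tuple


def iter_gzipped_json(path: str) -> Iterator[Tuple[str, Any]]:
    """Yield (provider, record) pairs from a gzip-compressed JSON array file."""
    provider = os.path.basename(path)  # assumption: provider is the bare file name
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for record in json.load(f):
            yield provider, record


# Made-up sample data written to a temporary file, only for this sketch.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.json.gz")
    with gzip.open(sample_path, "wt", encoding="utf-8") as f:
        json.dump([{"color": "red"}, {"color": "green"}], f)
    pairs = list(iter_gzipped_json(sample_path))
```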

## Our first custom Source

In [None]:
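from pyngsi.source import Source, Row

class CustomSource(Source):
    def __init__(self, rooms):
        # keep the incoming records (here, a list of CSV lines)
        self.rooms = rooms

    def __iter__(self):
        # yield one Row per record, tagged with the "custom" provider
        for record in self.rooms:
            yield Row("custom", record)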

Let's use it

In [None]:
# our CSV lines
rooms = ["Room1;23;720", "Room2;21;711"]

# init the source
src = CustomSource(rooms)

# consume the source and print rows
for row in src:
    print(row)

Your custom Source will rely on the same principles.<br>
The only difference is that you will have to focus on how to acquire your own data.
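
For instance, acquiring data from a text stream rather than from an in-memory list might look like the sketch below. It is written in plain Python so that it runs on its own: `iter_stream_rows` is a hypothetical helper, `io.StringIO` stands in for a real file or connection, and skipping blank lines is an arbitrary choice.

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_stream_rows(stream: TextIO, provider: str) -> Iterator[Tuple[str, str]]:
    """Yield (provider, record) pairs, one per non-empty line of the stream."""
    for line in stream:
        record = line.strip()
        if record:  # skip blank lines
            yield provider, record


stream = io.StringIO("Room1;23;720\n\nRoom2;21;711\n")
stream_rows = list(iter_stream_rows(stream, "custom"))
```

In a real custom Source, the same loop would live in `__iter__` and yield `Row` objects instead of tuples.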