# Data sources

Stores can be fed from several sources:

- [CSV files](#CSV)
- [Parquet files](#Parquet)
- [Pandas dataframes](#Pandas)
- [Spark dataframes](#Spark)

## CSV

In [None]:
import atoti as tt

session = tt.create_session()

In [None]:
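# Load the CSV file into a store named "First", using the "ID" column as its key.
csv_store = session.read_csv("data/example.csv", keys=["ID"], store_name="First")
# Display the first rows of the store.
csv_store.head()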
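
To try the cell above without an existing file, a small CSV can be generated with the Python standard library. This is a minimal sketch: only the `ID` column name comes from the example above, while the `Product` and `Price` columns, the sample rows, and the `write_sample_csv` helper are hypothetical. It writes to a temporary directory so that it does not overwrite a real `data/example.csv`.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample rows; only the "ID" column name comes from the example above.
FIELDNAMES = ["ID", "Product", "Price"]
ROWS = [
    {"ID": 1, "Product": "apple", "Price": 1.5},
    {"ID": 2, "Product": "banana", "Price": 0.25},
    {"ID": 3, "Product": "cherry", "Price": 3.0},
]


def write_sample_csv(path):
    """Write the sample rows to `path`, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(ROWS)
    return path


# Write to a temporary directory to avoid overwriting an existing file.
sample_path = write_sample_csv(Path(tempfile.mkdtemp()) / "data" / "example.csv")

# Read the file back to check its header and row count.
with sample_path.open(newline="") as f:
    reader = csv.DictReader(f)
    header = reader.fieldnames
    loaded_rows = list(reader)

print(header, len(loaded_rows))
```

The resulting path could then be passed to `session.read_csv` in place of `"data/example.csv"`.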

## Parquet

[Apache Parquet](https://parquet.apache.org/) is a columnar storage format. Parquet files can be used as a source:

In [None]:
parquet_store = session.read_parquet("data/example.parquet", keys=["ProductId"])
parquet_store.head()

## Pandas

_pandas_ is an open-source library providing easy-to-use data structures and data analysis tools.
For more details on using _pandas_, see its [cookbook](https://pandas.pydata.org/pandas-docs/stable/user_guide/cookbook.html).

A _pandas_ DataFrame can be used as a source to feed a store:

In [None]:
import pandas as pd

dataframe = pd.read_csv("data/example.csv")
pandas_store = session.read_pandas(dataframe, "Second", keys=["ID"])
pandas_store.head()
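
Before passing a DataFrame to `read_pandas` with `keys=["ID"]`, it can be useful to check that the key column holds one row per value. The sketch below assumes that key columns are meant to identify rows uniquely, which the text above does not state explicitly. The sample data is hypothetical, and keeping the last duplicate is only one possible policy.

```python
import pandas as pd

# Hypothetical data; only the "ID" key column comes from the example above.
dataframe = pd.DataFrame(
    {
        "ID": [1, 2, 2, 3],
        "Price": [1.5, 0.25, 0.3, 3.0],
    }
)

key_columns = ["ID"]

# True if at least two rows share the same key values.
has_duplicate_keys = bool(dataframe.duplicated(subset=key_columns).any())

# Keep the last row for each key and rebuild a clean 0..n-1 index.
deduplicated = dataframe.drop_duplicates(
    subset=key_columns, keep="last"
).reset_index(drop=True)

print(has_duplicate_keys)
print(deduplicated)
```

Here `deduplicated`, rather than `dataframe`, would be the frame to load into the store.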

## Spark

[Apache Spark](https://spark.apache.org/) is a unified analytics engine for large-scale data processing.

A Spark DataFrame can be used as a source to feed a store:

In [None]:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Demo").getOrCreate()

In [None]:
# Read the CSV with a header row and let Spark infer the column types.
spark_df = spark.read.csv("data/example.csv", header=True, inferSchema=True)
spark_df.show()

In [None]:
spark_store = session.read_spark(spark_df, "Spark", keys=["ID"])
spark_store.head()