# Read Data and Generate the Schema

This section covers how to load data into Draco and use the statistics it infers from that data.

## Available functions

The main functions allow you to get the schema from a Pandas dataframe or a file. These functions return a schema as a dictionary, which you can encode as Answer Set Programming facts using our generic `dict_to_facts` encoder.


```{eval-rst}
.. autofunction:: draco.schema.schema_from_dataframe
.. autofunction:: draco.schema.schema_from_file
```

## Usage Example

In [5]:
from draco import schema_from_dataframe, dict_to_facts

In this example, we use a weather dataset from Vega datasets, but this could be any Pandas dataframe.

In [6]:
from vega_datasets import data
df = data.seattle_weather()

We can then call `schema_from_dataframe` to get schema information from the Pandas dataframe. The function returns this information as a dictionary.

In [7]:
schema = schema_from_dataframe(df)
schema

{'number_rows': 1461,
 'field': [{'name': 'date', 'type': 'datetime', 'unique': 1461},
  {'name': 'precipitation',
   'type': 'number',
   'unique': 111,
   'min': 0,
   'max': 55,
   'std': 6},
  {'name': 'temp_max',
   'type': 'number',
   'unique': 67,
   'min': -1,
   'max': 35,
   'std': 7},
  {'name': 'temp_min',
   'type': 'number',
   'unique': 55,
   'min': -7,
   'max': 18,
   'std': 5},
  {'name': 'wind',
   'type': 'number',
   'unique': 79,
   'min': 0,
   'max': 9,
   'std': 1},
  {'name': 'weather', 'type': 'string', 'unique': 5, 'freq': 714}]}
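To make the statistics in this dictionary more concrete, here is a minimal sketch of how similar values could be computed with the standard library alone. It is not Draco's implementation: `sketch_schema` and the `toy` data are hypothetical names, datetime columns are ignored, and truncating `min`, `max`, and `std` to integers is an assumption based on the integer values in the output above. `freq` is taken to be the count of the most common value.

```python
import statistics
from collections import Counter


def sketch_schema(columns):
    """Build a schema-like dictionary from a mapping of column name to values.

    Hypothetical helper for illustration only; expects non-empty columns
    with at least two values each.
    """
    number_rows = len(next(iter(columns.values()), []))
    fields = []
    for name, values in columns.items():
        field = {"name": name, "unique": len(set(values))}
        if all(isinstance(v, (int, float)) for v in values):
            field["type"] = "number"
            field["min"] = int(min(values))
            field["max"] = int(max(values))
            # Sample standard deviation, truncated to an integer (assumption).
            field["std"] = int(statistics.stdev(values))
        else:
            field["type"] = "string"
            # Number of occurrences of the most common value.
            field["freq"] = Counter(values).most_common(1)[0][1]
        fields.append(field)
    return {"number_rows": number_rows, "field": fields}


# Toy data standing in for a dataframe: two columns, three rows.
toy = {"wind": [1.0, 3.0, 5.0], "weather": ["rain", "sun", "rain"]}
print(sketch_schema(toy))
```

The result has the same shape as the schema above: a row count plus one entry per field, with numeric summaries for number columns and a frequency for string columns.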

We can then use `dict_to_facts` to convert the schema dictionary into facts that Draco's constraint solver can use. The function returns a list of facts, which the solver can parse and take into account in the recommendation process.

In [20]:
dict_to_facts(schema)

['attribute(number_rows,root,1461).',
 'property(field,root,0).',
 'attribute((field,name),0,date).',
 'attribute((field,type),0,datetime).',
 'attribute((field,unique),0,1461).',
 'property(field,root,1).',
 'attribute((field,name),1,precipitation).',
 'attribute((field,type),1,number).',
 'attribute((field,unique),1,111).',
 'attribute((field,min),1,0).',
 'attribute((field,max),1,55).',
 'attribute((field,std),1,6).',
 'property(field,root,2).',
 'attribute((field,name),2,temp_max).',
 'attribute((field,type),2,number).',
 'attribute((field,unique),2,67).',
 'attribute((field,min),2,-1).',
 'attribute((field,max),2,35).',
 'attribute((field,std),2,7).',
 'property(field,root,3).',
 'attribute((field,name),3,temp_min).',
 'attribute((field,type),3,number).',
 'attribute((field,unique),3,55).',
 'attribute((field,min),3,-7).',
 'attribute((field,max),3,18).',
 'attribute((field,std),3,5).',
 'property(field,root,4).',
 'attribute((field,name),4,wind).',
 'attribute((field,type),4,numb

This list of facts can be passed directly into Draco's constraint solver, the Clingo solver.
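The encoding pattern visible in the output above (one `attribute` fact per scalar value, one `property` fact per list item, and a running integer id for each item) can be sketched as follows. This is a simplified illustration written for this page, not the actual `dict_to_facts` implementation; `sketch_dict_to_facts` is a hypothetical name, and it only handles scalars and lists of dictionaries.

```python
from itertools import count


def sketch_dict_to_facts(data, path=(), parent="root", ids=None):
    """Encode a nested dictionary as fact strings (simplified sketch)."""
    if ids is None:
        ids = count()  # running ids shared across the whole structure
    facts = []
    for key, value in data.items():
        full_path = path + (key,)
        if isinstance(value, list):
            # Each list item becomes a new entity attached to its parent.
            for item in value:
                ident = next(ids)
                facts.append(f"property({key},{parent},{ident}).")
                facts.extend(sketch_dict_to_facts(item, full_path, ident, ids))
        else:
            # Nested keys are written as a parenthesized path, e.g. (field,name).
            if len(full_path) == 1:
                name = full_path[0]
            else:
                name = "(" + ",".join(full_path) + ")"
            facts.append(f"attribute({name},{parent},{value}).")
    return facts


small_schema = {
    "number_rows": 2,
    "field": [
        {"name": "wind", "type": "number"},
        {"name": "weather", "type": "string"},
    ],
}
for fact in sketch_dict_to_facts(small_schema):
    print(fact)
```

Each fact names an attribute, the entity it belongs to, and its value, which is why the field entries refer to ids such as `0` and `1` rather than to `root`.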

# Run the Clingo Solver

The facts produced above form the program that is passed to the Clingo solver. `run_clingo` runs Clingo on that program and returns the models it finds.

In [21]:
from draco import run_clingo

In [22]:
program = dict_to_facts(schema)
models = run_clingo(program)

`models` holds the models that satisfy the constraints passed to `run_clingo`.

# Converting Satisfiable Models back into Views

To produce visualizations that satisfy the constraints, we must convert the answer set of each model back into a nested data structure that resembles the schema. Models are objects of class `Model`, which has the members `answer_set` (holding `clingo.Symbol` objects), `cost` of type `int`, and `number` of type `int`.

In [23]:
from draco import answer_set_to_dict

In [24]:
views = [answer_set_to_dict(model.answer_set) for model in models]
views


[{'number_rows': 1461,
  'field': [{'name': 'date', 'type': 'datetime', 'unique': 1461},
   {'name': 'precipitation',
    'type': 'number',
    'unique': 111,
    'min': 0,
    'max': 55,
    'std': 6},
   {'name': 'temp_max',
    'type': 'number',
    'unique': 67,
    'min': -1,
    'max': 35,
    'std': 7},
   {'name': 'temp_min',
    'type': 'number',
    'unique': 55,
    'min': -7,
    'max': 18,
    'std': 5},
   {'name': 'wind',
    'type': 'number',
    'unique': 79,
    'min': 0,
    'max': 9,
    'std': 1},
   {'name': 'weather', 'type': 'string', 'unique': 5, 'freq': 714}]}]

In the example above, exactly one model satisfies the constraints, so the result is a single nested data structure. This structure can be fed into visualization tools such as Vega-Lite to produce views.
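The conversion from facts back to a nested structure can be illustrated with a small standalone sketch. It works on fact strings in the format shown earlier rather than on `clingo.Symbol` objects, so it is not how `answer_set_to_dict` is implemented; `sketch_facts_to_dict` and `split_args` are hypothetical helpers, and converting digit-only values to integers is an assumption.

```python
import re

FACT = re.compile(r"^(attribute|property)\((.+)\)\.$")


def split_args(body):
    """Split on top-level commas, keeping parenthesized paths intact."""
    args, depth, current = [], 0, ""
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append(current)
            current = ""
        else:
            current += ch
    args.append(current)
    return args


def sketch_facts_to_dict(facts):
    """Rebuild a nested dictionary from fact strings (simplified sketch)."""
    nodes = {"root": {}}
    for fact in facts:
        kind, body = FACT.match(fact).groups()
        key, parent, value = split_args(body)
        owner = nodes.setdefault(parent, {})
        if kind == "property":
            # A property introduces a child entity under its parent.
            child = nodes.setdefault(value, {})
            owner.setdefault(key, []).append(child)
        else:
            # Use the last path element as the key, e.g. (field,name) -> name.
            name = key.strip("()").split(",")[-1]
            owner[name] = int(value) if re.fullmatch(r"-?\d+", value) else value
    return nodes["root"]


facts = [
    "attribute(number_rows,root,2).",
    "property(field,root,0).",
    "attribute((field,name),0,wind).",
    "property(field,root,1).",
    "attribute((field,name),1,weather).",
]
print(sketch_facts_to_dict(facts))
```

Entities are looked up by id, so attributes end up attached to the right field regardless of how many fields the schema contains.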