# Generate an input schema for Fletchgen

We first import pyarrow:

In [1]:
import pyarrow as pa

Now we can start constructing the schema.
For this simple example, the schema will describe the types of a very simple table.
The table will only contain a single column with some numbers, called "number".

In [2]:
# Create a new field named "number" of type int64 that is not nullable.
number_field = pa.field('number', pa.int64(), nullable=False)

# Create a list of fields for pa.schema()
schema_fields = [number_field]

# Create a new schema from the fields.
schema = pa.schema(schema_fields)

Fletchgen would be able to process this schema already.
However, we will take a look at how we can pass some additional options to Fletchgen to make sure it generates the type of hardware infrastructure that we want.

### Schema mode
One important option is the access mode of the RecordBatch that is described by this schema. The access mode is set in the metadata of the Arrow schema with the key `'fletcher_mode'` and can be either:
* ```'read'```: when you want the FPGA kernel to "read" from the RecordBatch in memory, or
* ```'write'```: when you want the FPGA kernel to "write" to the RecordBatch in memory.

Note that in the programming model of Arrow, RecordBatches are immutable once constructed. Therefore, a schema mode cannot be both `'read'` and `'write'` at the same time.

### Schema name
As Fletchgen can create kernels that operate on multiple input and/or output RecordBatches, we need a way of telling which RecordBatch is which, in case they have fields of the same name. Therefore, we must name each input schema using the metadata key `'fletcher_name'`.

We'll now go ahead and define that we'd like to read from the RecordBatch that this Schema can describe, and that its name should be 'ExampleBatch'.

In [3]:
# Construct some metadata to tell Fletchgen that it
# should allow the FPGA kernel to read from this schema.
metadata = {b'fletcher_mode': b'read',
            b'fletcher_name': b'ExampleBatch'}

# Add the metadata to the schema
schema = schema.add_metadata(metadata)

# Show the schema
print(schema)

number: int64 not null
metadata
--------
OrderedDict([(b'fletcher_mode', b'read'), (b'fletcher_name', b'ExampleBatch')])


We have now created a schema and added the appropriate metadata for Fletchgen to do its job. All we have to do now is save it to a file so we can pass it to Fletchgen.

In [4]:
# Serialize the schema itself into an Arrow buffer.
serialized_schema = schema.serialize()

# Write the buffer to a file output stream, closing the stream when done.
with pa.output_stream('schema.as') as stream:
    stream.write(serialized_schema)

In your project folder, you should now find a file that contains the schema. We will use this file as input for Fletchgen.

[Return to the Sum tutorial](../README.md#generate-a-recordbatch)