# Modifying the source input with a key:value map

In this notebook, we will look at one feature of the input processors: they can rename the fields of the loaded metadata using a simple key:value map.

Why? Why not just have the input file match the expected values?

Well, sometimes you **will** have to do that. But let's imagine that your pipeline produces an almost-ready input, except that your laboratory, instead of calling its samples by `name`, uses another identifier, such as `sample_id`. You want to automate sending the metadata as soon as the pipeline has it ready, but you don't want to write another script. Easy then! Let's see how to do that:

In [2]:
## Import everything we need
from biobroker.input_processor import TsvInputProcessor # An input processor

path = "simple_sample_sumple.tsv"  # The TSV file we are about to create

sample_tsv = [
    ["sample_id", "collected_at"],
    ["sumple", "noon"],
]

# Write the rows as tab-separated lines
writable_sample = "\n".join(["\t".join(row) for row in sample_tsv])
with open(path, "w") as f:
    f.write(writable_sample)

## Set up the required entities

input_processor = TsvInputProcessor(input_data=path)

print(input_processor.input_data)

[{'sample_id': 'sumple', 'collected_at': 'noon'}]


Up to here, everything is the same: you have set up the input processor pointing to the data.

Here comes the slightly different part: let's transform the metadata so that `sample_id` becomes `name`:

In [3]:
map_of_fields = {
    "sample_id": "name"
}

input_processor.transform(field_mapping=map_of_fields, delete_non_mapped_fields=False)

Let's take a look at the metadata now!

In [4]:
print(input_processor.input_data)

[{'collected_at': 'noon', 'name': 'sumple'}]


Ta-da! We now have the samples in the format that we want, and we can `process` and `submit` them without any issue.

While this is not a complicated transformation, it can help you set up your own pipelines without having to tailor the metadata in your pipeline's output.
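To make the effect of the map more concrete, here is a minimal pure-Python sketch of the same kind of renaming. It is not biobroker's implementation: `rename_fields` is a hypothetical helper written for this illustration, and it assumes that `delete_non_mapped_fields=True` means "drop every field that does not appear in the map", as the flag's name suggests.

```python
from typing import Dict, List


def rename_fields(
    records: List[Dict[str, str]],
    field_mapping: Dict[str, str],
    delete_non_mapped_fields: bool = False,
) -> List[Dict[str, str]]:
    """Hypothetical helper: return copies of the records with their keys renamed."""
    renamed = []
    for record in records:
        new_record = {}
        for key, value in record.items():
            if key in field_mapping:
                # Mapped field: store the value under its new name
                new_record[field_mapping[key]] = value
            elif not delete_non_mapped_fields:
                # Unmapped field: keep it unchanged (assumed behaviour of the flag)
                new_record[key] = value
        renamed.append(new_record)
    return renamed


records = [{"sample_id": "sumple", "collected_at": "noon"}]
mapping = {"sample_id": "name"}

kept = rename_fields(records, mapping)
dropped = rename_fields(records, mapping, delete_non_mapped_fields=True)

print(kept)
print(dropped)
```

With the flag left at `False`, `collected_at` survives next to the renamed `name` field; with the flag set to `True`, only `name` remains in this sketch. The key order printed here may differ from what the input processor shows.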