# Changing datafile using Tags

## Table of Contents (TOC) <a class="anchor" id="toc"></a>

- [1. Imports](#first-bullet)
- [2. Zegami Client, Workspace, and Collection](#second-bullet)
- [3. Changing Datafile Value of One Item](#third-bullet)
    - [3.1. Creating a Tag](#creating-tag)
    - [3.2. Converting Fahrenheit values to Celsius](#fah_to_cel)

---

## 1. Imports <a class="anchor" id="first-bullet"></a> <span style="font-size:0.5em;">[(Back to TOC)](#toc)</span>

First we'll start by importing all of the needed libraries.

In [1]:
from zegami_sdk.client import ZegamiClient

## 2. Zegami client, workspace, and collection <a class="anchor" id="second-bullet"></a> <span style="font-size:0.5em;">[(Back to TOC)](#toc)</span>

Here we'll initialize the Zegami client so that we can have access to our collections.

If you haven't initialized your client before, you'll have to provide your username and password in order to create a security token. You only have to do this once. After that, you can initialize the client without providing login details, as the saved security token will be used instead!

In [2]:
zc = ZegamiClient()

Used token from '/home/martim-zegami/zegami_com.zegami.token'.
Client initialized successfully, welcome .



After initializing the client, we can retrieve our workspace and collection.

In [3]:
workspaces_lst = zc.workspaces
print(workspaces_lst)

[<Workspace id=GUDe4kRY name=Martim Chaves>]


In [4]:
# Get workspace using the ID
WORKSPACE_ID = zc.workspaces[0].id # id=GUDe4kRY
workspace = zc.get_workspace_by_id(WORKSPACE_ID)

In [5]:
workspace.show_collections()


Collections in 'Martim Chaves' (3):
6271698f81e4bccb640d6e24 : Flags of the world
62792f0488e4d3d7181d8256 : nations_of_the_world
627e12a31bdd62bb88d6afb8 : X-ray-analysis


In [6]:
# Let's get the X-ray-analysis collection, which is the one we're currently working with
collection = workspace.get_collection_by_name('X-ray-analysis')
print(collection)

<CollectionV1 id=627e12a31bdd62bb88d6afb8 name=X-ray-analysis>


## 3. Changing datafile value of an item using Tags <a class="anchor" id="third-bullet"></a> <span style="font-size:0.5em;">[(Back to TOC)](#toc)</span>

When we were investigating the data using the Zegami platform, we noticed that one datapoint had its temperature recorded in the wrong unit (Fahrenheit instead of Celsius). That single anomalous value stretched the temperature axis, so the scale was no longer appropriate for the rest of the data. We can change the values in the datafile using the SDK!

<img src="./images/weird_axis.png" width="1000"/>

### 3.1. Creating a Tag <a class="anchor" id="creating-tag"></a> <span style="font-size:0.5em;">[(Back to TOC)](#toc)</span>

In this case, we're only looking at one sample, so we could manually change the value in the datafile and reupload it. But, for the sake of demonstrating something that is more scalable, let's imagine that there are several samples that have this issue. Using the platform, we can easily create a **Tag** associated with these samples, using the selection tool.

<img src="./images/tag_wrong_unit.png" width="1000"/>

We can now change the values of the samples belonging to that Tag.

### 3.2. Converting Fahrenheit values to Celsius <a class="anchor" id="fah_to_cel"></a> <span style="font-size:0.5em;">[(Back to TOC)](#toc)</span>

First, we'll start by creating the function that we'll use to modify the values. Then, we'll get a copy of the dataframe containing the samples (rows) belonging to the Tag that we created. We'll alter that copy, and then we'll use it to update the datafile.

In [7]:
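def fahrenheit_to_celsius(temp: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    temp -= 32     # shift so that freezing point is at zero
    temp *= 5/9.   # rescale Fahrenheit degrees to Celsius degrees
    return temp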
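
Before applying a conversion to real data, it can be worth checking it against a few well-known reference points. The snippet below is a minimal, standalone sanity check; it repeats the same formula (written as a single expression) so that it runs on its own, and the chosen reference temperatures are just illustrative.

```python
import math


def fahrenheit_to_celsius(temp: float) -> float:
    """Same formula as in the cell above, repeated so this check is self-contained."""
    return (temp - 32) * 5 / 9


# Well-known reference points: freezing and boiling point of water,
# and the temperature at which both scales coincide.
assert math.isclose(fahrenheit_to_celsius(32.0), 0.0, abs_tol=1e-9)
assert math.isclose(fahrenheit_to_celsius(212.0), 100.0)
assert math.isclose(fahrenheit_to_celsius(-40.0), -40.0)
```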

In [22]:
collec_temp_unit_tag = collection.get_rows_by_tags(['wrong_temp_unit']).copy()

In [25]:
collec_temp_unit_tag['temperature'] = collec_temp_unit_tag['temperature'].apply(fahrenheit_to_celsius)

In [27]:
# Overwrite the matching rows of the full dataframe in place (pandas aligns on the index)
collection.rows.update(collec_temp_unit_tag)

In [33]:
# Upload the modified dataframe as the collection's new datafile
collection.replace_data(collection.rows)
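
To see why updating the full dataframe from the tagged copy only touches the tagged rows, here is a small sketch using plain pandas on made-up data (the column values and row labels are hypothetical, not taken from the collection). `DataFrame.update` modifies the dataframe in place and aligns on the index, so rows that are absent from the copy are left untouched.

```python
import pandas as pd

# Toy stand-in for collection.rows; the values are invented for illustration.
rows = pd.DataFrame(
    {"name": ["a", "b", "c"], "temperature": [36.5, 95.0, 37.1]},
    index=[0, 1, 2],
)

# Stand-in for the rows returned for the Tag: only row 1 has the wrong unit.
tagged = rows.loc[[1]].copy()
tagged["temperature"] = (tagged["temperature"] - 32) * 5 / 9

# update() works in place and returns None; it matches rows by index label.
rows.update(tagged)

print(rows)
```

Because the alignment is done by index label, the copy must keep the original index of the rows it came from, which is the case when it is obtained by selecting rows from the full dataframe.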

When doing these sorts of operations (uploading data), we should check `collection.status` before relying on the new values, as the changes may take some time to be processed.

In [34]:
collection.status

{'changed_at': 'Mon, 16 May 2022 19:13:47 GMT',
 'progress': 1.0,
 'status': 'completed'}
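
If the status is not yet `'completed'`, one option is to poll until it is. The helper below is a hypothetical sketch, not part of the Zegami SDK: `wait_until_completed` is a made-up name, it takes any callable that returns a status dictionary shaped like the output above, and the `'processing'` value in the demo is an assumed placeholder for whatever intermediate status the platform reports.

```python
import time
from typing import Callable, Mapping


def wait_until_completed(
    get_status: Callable[[], Mapping],
    timeout_s: float = 60.0,
    poll_s: float = 1.0,
) -> Mapping:
    """Call get_status() repeatedly until its 'status' field is 'completed'.

    Hypothetical helper: raises TimeoutError if that does not happen in time.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status.get("status") == "completed":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still not completed, last status: {status!r}")
        time.sleep(poll_s)


# Demo with a fake status source, so the sketch runs without a Zegami account.
fake_statuses = iter([
    {"status": "processing", "progress": 0.5},  # assumed intermediate value
    {"status": "completed", "progress": 1.0},
])
final = wait_until_completed(lambda: next(fake_statuses), timeout_s=5, poll_s=0.01)
print(final)
```

With a real collection, the same helper could be called as `wait_until_completed(lambda: collection.status)`, assuming `collection.status` is re-evaluated on each access.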

In [35]:
collection.get_rows_by_tags(['wrong_temp_unit']).temperature

108    35.0
Name: temperature, dtype: float64

Now that the value has been updated, let's check Zegami's platform!

<img src="./images/fixed_temp_axis.png" width="1000"/>

Much better!