## Golden dataset generation using Label Studio

At this point, the local file system contains a folder with the filtered raw data. The next step is to start the Label Studio interface and mark the test segment of each signal (recording session) in the dataset.

Start the Label Studio interface from a terminal, not from this notebook: the server has to keep running while the following cells access the Label Studio API.

Run the following commands to start the server:

```
export LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/workspaces/EMGStateDetect/10mov4chFU_AFEs/ADS/

export LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true

label-studio start
```

Before running these commands, install the required Label Studio packages:

`pip install -q label-studio label-studio-sdk`
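
Before running the cells that follow, it can help to confirm that the server started from the terminal is accepting connections. The snippet below is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, and it only checks that something is listening on the host and port of the URL, not that the listener is actually Label Studio.

```python
import socket
from urllib.parse import urlsplit


def is_port_open(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host and port of `url` succeeds."""
    parts = urlsplit(url)
    # Fall back to the default port of the scheme when the URL has none
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unknown host, ...
        return False
```

Calling `is_port_open('http://localhost:8080')` should return `True` once `label-studio start` has finished booting, assuming the default port is in use.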

In [None]:
# Import the Label Studio SDK client
from label_studio_sdk.client import LabelStudio

# URL where Label Studio is accessible
LABEL_STUDIO_URL = 'http://localhost:8080'
# The API key is available on the Account & Settings > Access Tokens page of the Label Studio UI
API_KEY = 'XXXXXXXX'

# Create the client that the following cells use to call the Label Studio API
ls_client = LabelStudio(base_url=LABEL_STUDIO_URL, api_key=API_KEY)

In [None]:
response = ls_client.projects.create(
    title = "Test Sets",
    description = "This is a project created using the Label Studio SDK",
    label_config = """
    <View>
        <Header value="Time Series classification"
                style="font-weight: normal"/>
        <TimeSeriesLabels name="label" toName="ts">
            <Label value="SegmentOfInterest"/>
        </TimeSeriesLabels>
        <TimeSeries name="ts" value="$csv" valueType="url">
            <Channel column="ch0" height="40" showAxis="false" fixedScale="false"/>
            <Channel column="ch1" height="40" showAxis="false" fixedScale="false"/>
            <Channel column="ch2" height="40" showAxis="false" fixedScale="false"/>
            <Channel column="ch3" height="40" showAxis="false" fixedScale="false"/>
        </TimeSeries>
    </View>
    """
)

project_id = response.id
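
The labeling configuration above plots the columns `ch0` to `ch3` of each CSV file, so a file that lacks one of these headers will probably not render as intended. The following is a minimal sketch of a header check, assuming the filtered files are comma-separated and start with a header row; `missing_channels` and the sample content are hypothetical and only illustrate the idea.

```python
import csv
import io

# Column names taken from the <Channel> tags of the labeling configuration
REQUIRED_COLUMNS = ("ch0", "ch1", "ch2", "ch3")


def missing_channels(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical file content, for illustration only
sample = io.StringIO("ch0,ch1,ch2,ch3\n0.1,0.2,0.3,0.4\n")
print(missing_channels(sample))  # []
```

For a real file, the same function can be called on an open file handle, for example `missing_channels(open(path, newline=''))`.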

In [None]:
# In case a project has to be deleted (replace 12 with the ID of that project)
ls_client.projects.delete(id=12)

In [None]:
response = ls_client.import_storage.local.create(
    project=project_id,
    path='/workspaces/EMGStateDetect/10mov4chFU_AFEs/ADS/filtered/',
    use_blob_urls=True,
)
local_storage_id = response.id
ls_client.import_storage.local.sync(id=local_storage_id)
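
The storage path used here sits below the directory exported as `LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT` when the server was started, which is what Label Studio's local file serving is understood to require. A minimal sketch of that check follows; `is_inside_document_root` is a hypothetical helper, and the comparison is purely lexical (it does not resolve symlinks or `..` components).

```python
from pathlib import PurePosixPath


def is_inside_document_root(storage_path: str, document_root: str) -> bool:
    """Return True if storage_path is a strict subdirectory of document_root."""
    storage = PurePosixPath(storage_path)
    root = PurePosixPath(document_root)
    return root in storage.parents


# Values copied from the server start command and the storage definition
DOCUMENT_ROOT = "/workspaces/EMGStateDetect/10mov4chFU_AFEs/ADS/"
STORAGE_PATH = "/workspaces/EMGStateDetect/10mov4chFU_AFEs/ADS/filtered/"
print(is_inside_document_root(STORAGE_PATH, DOCUMENT_ROOT))  # True
```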

In [None]:
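import os
import re

import numpy as np

# List the tasks of the project created above
response = ls_client.tasks.list(project=project_id)

test_set_signal_segments = []

for task in response.items:
    print(task.storage_filename)
    # Extract the subject and class numbers from paths such as .../Subject_1/C_2.csv
    match = re.search(r'Subject_(\d+)/C_(\d+)\.csv$', task.storage_filename)
    if match is None or not task.annotations:
        # Skip files with an unexpected name and tasks that are not annotated yet
        continue
    subject_number = int(match.group(1))
    class_number = int(match.group(2))
    # Use the first labeled region of the first annotation
    segment = task.annotations[0]['result'][0]['value']
    test_set_signal_segments.append(
        [subject_number, class_number, segment['start'], segment['end']]
    )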

# Create the output folders before writing any file into them
golden_data_dir = '/workspaces/EMGStateDetect/10mov4chFU_AFEs/ADS/golden_set/data/'
os.makedirs(golden_data_dir, exist_ok=True)

# Save the segment boundaries of every recording
np.savetxt(
    '/workspaces/EMGStateDetect/10mov4chFU_AFEs/ADS/golden_set/splits.csv',
    np.array(test_set_signal_segments),
    fmt='%d',
    delimiter=',',
    header='subject,class,start,end',
    comments=''
)

for subject, class_, start, end in test_set_signal_segments:
    # Select the recording of this subject and class
    tmp_odh = TIADS1299_dataset.raw_odh.isolate_data("subjects", [subject])
    isolated_odh = tmp_odh.isolate_data("classes", [class_])
    # Keep only the annotated rows, with all channels
    golden_split = isolated_odh.data[0][start:end, :]
    np.savetxt(f'{golden_data_dir}S_{subject}_C_{class_}.csv', golden_split)
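
The cell above relies on two conventions: file names of the form `Subject_<n>/C_<m>.csv`, and annotations whose first result holds the `start` and `end` of the segment. The sketch below isolates that parsing step so it can be tried without a running server. `parse_task` and the example values are hypothetical, and plain dictionaries stand in for the objects returned by the SDK.

```python
import re

FILENAME_PATTERN = re.compile(r'Subject_(\d+)/C_(\d+)\.csv$')


def parse_task(storage_filename, annotations):
    """Return [subject, class, start, end], or None if the task cannot be used."""
    match = FILENAME_PATTERN.search(storage_filename)
    if match is None or not annotations or not annotations[0]['result']:
        # Unexpected file name, or no labeled region yet
        return None
    value = annotations[0]['result'][0]['value']
    return [int(match.group(1)), int(match.group(2)), value['start'], value['end']]


# Hypothetical task data, for illustration only
example_annotations = [{'result': [{'value': {'start': 1200, 'end': 4800}}]}]
print(parse_task('filtered/Subject_3/C_7.csv', example_annotations))
# [3, 7, 1200, 4800]
print(parse_task('filtered/notes.txt', example_annotations))
# None
```

Returning `None` instead of raising lets the caller skip tasks that are not annotated yet and report them separately.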
    