# Vantage6

In [None]:
import time
import matplotlib.pyplot as plt
from vantage6.client import Client
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import ConfusionMatrixDisplay
from src import config

## Researcher role

We are now _sharing_ the OPC-Radiomics data through vantage6 data nodes. As a researcher in a collaboration, you can send tasks to these nodes. We will send a logistic regression task that performs exactly the same steps as in the previous tutorial; the difference is that all the steps are now automated.

Before starting, you need to edit the `config.py` file, which is located in the `src` directory. The information you need to provide in this file will be shared with you during the lecture.
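
As a rough guide, the `config.py` file could look like the sketch below. The variable names are the ones read by the client cell later in this notebook; every value is a placeholder assumed for illustration and must be replaced with the details shared during the lecture.

```python
# Hypothetical sketch of src/config.py.
# All values are placeholders, not real server details or credentials.
server_url = "https://server.example.org"  # placeholder address of the vantage6 server
server_port = 443                          # placeholder port
server_api = "/api"                        # placeholder API path
username = "your-username"                 # placeholder researcher account
password = "your-password"                 # placeholder password
```

Avoid committing real credentials to version control once the file is filled in.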

### Training a model

In [None]:
# Initialize the vantage6 client object, and run the authentication
client = Client(
    config.server_url, config.server_port, config.server_api,
    verbose=True
)
client.authenticate(config.username, config.password)
client.setup_encryption(None)

In [None]:
# EXERCISE
# Vantage6 logistic regression task input.
# Set the maximum number of iterations to a high value and delta to a low value.
input_ = {
    'method': 'master',
    'master': True,
    'kwargs': {
        'org_ids': [2, 3],
        'predictors': <YOUR_ANSWER_HERE>,
        'outcome': <YOUR_ANSWER_HERE>,
        'classes': <YOUR_ANSWER_HERE>,
        'max_iter': <YOUR_ANSWER_HERE>,
        'delta': <YOUR_ANSWER_HERE>,
    }
}

In [None]:
# Vantage6 logistic regression task creation
task = client.task.create(
    collaboration=1,
    organizations=[2, 3],
    name='v6-logistic-regression-py',
    image='ghcr.io/maastrichtu-cds/v6-logistic-regression-py:latest',
    description='logistic regression',
    input=input_,
    data_format='json'
)

In [None]:
%%time
# Retrieving results
task_info = client.task.get(task['id'], include_results=True)
while not task_info.get('complete'):
    print('Waiting for results...')
    task_info = client.task.get(task['id'], include_results=True)
    time.sleep(1)
result_info = client.result.list(task=task_info['id'])
results = result_info['data'][0]['result']

In [None]:
# EXERCISE
# Explore the results dictionary and create a variable named `model` with the model received


- How do the results compare with the ones you obtained manually?
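
One possible way to hold received parameters in a scikit-learn object and compare them with a manual fit is sketched below. It assumes a binary model; all numbers are made up for illustration and are not results from the OPC-Radiomics data, and the structure of the actual `results` dictionary is not assumed here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up numbers for illustration only; not results from the tutorial data.
federated_coef = np.array([[2.0, -1.0]])
federated_intercept = np.array([0.5])
manual_coef = np.array([[2.01, -0.99]])  # hypothetical coefficients from a manual run

# Populate an unfitted estimator by hand so it can be used for prediction.
model = LogisticRegression()
model.coef_ = federated_coef
model.intercept_ = federated_intercept
model.classes_ = np.array([0, 1])

# Compare the two coefficient sets within an absolute tolerance.
close_enough = np.allclose(federated_coef, manual_coef, atol=0.05)

# Predict on two illustrative samples using the hand-set parameters.
predictions = model.predict(np.array([[1.0, 0.0], [0.0, 3.0]]))
```

Setting `coef_`, `intercept_` and `classes_` directly skips training entirely, so the object only reproduces whatever parameters it is given.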

### Validation

Let's now run model validation in the third node.
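
Validation of a classifier is commonly summarised with a confusion matrix. The sketch below shows how such a matrix can be computed from true and predicted labels; the function name and the labels are illustrative assumptions and do not reflect the output format of the vantage6 algorithm.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Count outcomes: rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


# Illustrative labels only; not taken from the tutorial data.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred, classes=[0, 1])

# Accuracy is the share of samples on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(matrix))) / len(y_true)
```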

In [None]:
# EXERCISE
# Input for task that will run model validation
input_ = {
    'method': 'run_validation',
    'master': False,
    'kwargs': {
        'parameters': [model.intercept_.tolist(), model.coef_.tolist()],
        'classes': <YOUR_ANSWER_HERE>,
        'predictors': <YOUR_ANSWER_HERE>,
        'outcome': <YOUR_ANSWER_HERE>,
    }
}

In [None]:
# EXERCISE
# Create the vantage6 logistic regression validation task.
# Send the task to organization 4 and use the same image as before.


In [None]:
# EXERCISE
# Retrieve results


In [None]:
# EXERCISE
# Explore the results dictionary


In [None]:
# EXERCISE
# Visualise confusion matrix


## References

- [Vantage6 documentation](https://docs.vantage6.ai/en/main/)
- [Vantage6 logistic regression algorithm](https://github.com/MaastrichtU-CDS/v6-logistic-regression-py)