## Dementia progression estimation using ML based on disease-related survey answers

We developed a machine learning model based on an **artificial neural network**. The input nodes of the network take the normalized scores of the patient's already evaluated answers to predefined questions. The inner part of the network consists of 3 densely connected hidden layers, each with 128 nodes. The output is a single node indicating the _estimated progression_ of the disease (or the probability of having mild cognitive impairment).
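
To give a sense of the model size this architecture implies, the following sketch counts the weights and biases of a fully connected network with three hidden layers of 128 nodes and one output node. The function name `count_dense_parameters` and the survey length of 20 questions are illustrative assumptions, not values taken from the project.

```python
def count_dense_parameters(n_inputs, hidden_sizes=(128, 128, 128), n_outputs=1):
    """Count the weights and biases of a fully connected feed-forward network."""
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    total = 0
    # Each dense layer has fan_in * fan_out weights plus one bias per output node
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out
    return total


# Hypothetical survey with 20 questions
print(count_dense_parameters(20))
```

The count grows only linearly with the number of questions, because only the first layer depends on the input size.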

The results can then be used for automated self-assessment through a survey, for better tracking of patients, and for creating personalized surveys that determine a patient's condition more accurately.

In [None]:
import json
import os

import tensorflow as tf

Extract the training data from JSON-formatted files (TODO: request the data through a Web API integration):

The data is formatted like this: `{"ans": [12, 53, 46, ..., 99], "res": 35}`. The `ans` array contains the score for each question in percent, and `res` is the overall evaluation of how far the disease has progressed, also in percent.
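
As a small illustration of this format, the sketch below parses one record and scales its percentages to the range 0 to 1. The helper name `parse_record`, the range check and the shortened sample record are assumptions made for the example; they are not part of the original notebook.

```python
import json


def parse_record(text):
    """Parse one survey record and scale its percentages to the range [0, 1]."""
    record = json.loads(text)
    answers = [score / 100 for score in record["ans"]]
    result = record["res"] / 100
    # Reject values outside 0-100 %, which would indicate a malformed file
    if not all(0.0 <= value <= 1.0 for value in answers + [result]):
        raise ValueError("scores must be percentages between 0 and 100")
    return answers, result


sample = '{"ans": [12, 53, 46, 99], "res": 35}'  # shortened, made-up record
answers, result = parse_record(sample)
print(answers, result)
```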

In [None]:
train_data = []
train_res = []

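# Read every training record; sorted() keeps the file order reproducible
for f in sorted(os.listdir('data/train')):
    with open(os.path.join('data/train', f)) as fh:
        data = json.load(fh)
    # Scale the percentages to the range [0, 1]
    train_data.append([x / 100 for x in data['ans']])
    train_res.append([data['res'] / 100])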

Build the model using the TensorFlow Keras library:

In [None]:
n = len(train_data[0])  # number of questions in the survey

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    # Sigmoid keeps the single output in (0, 1); softmax over one node is always 1
    tf.keras.layers.Dense(1, activation='sigmoid')
])
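
Because the network ends in a single node, the choice of output activation matters. The pure-Python sketch below (the helpers `softmax` and `sigmoid` are written out for illustration and are not Keras calls) shows that a softmax over one value always returns 1, whereas a sigmoid spreads values over the interval (0, 1), which is what a progression estimate needs.

```python
import math


def softmax(values):
    """Normalize exponentials so that the outputs sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    """Map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


for raw in (-3.0, 0.0, 3.0):
    # The softmax of a single value is 1.0 regardless of the input
    print(raw, softmax([raw])[0], round(sigmoid(raw), 3))
```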

We then compile the model and start training it with the data:

In [None]:
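# The target is a continuous value in [0, 1], so train as a regression:
# mean squared error as the loss, mean absolute error as a readable metric
model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])

model.fit(tf.constant(train_data, dtype=tf.float32),
          tf.constant(train_res, dtype=tf.float32),
          epochs=300)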
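
Training on all available records gives no indication of how the model behaves on unseen patients. One common way to check this is to hold out part of the records for validation; the sketch below shows such a split in plain Python. The helper `split_records`, the 20 % validation share and the fixed seed are assumptions made for illustration, not part of the original notebook.

```python
import random


def split_records(inputs, targets, validation_share=0.2, seed=42):
    """Shuffle the records reproducibly and split them into training and validation parts."""
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)
    n_val = int(len(indices) * validation_share)
    val_idx, train_idx = indices[:n_val], indices[n_val:]

    def pick(rows, idx):
        return [rows[i] for i in idx]

    return (pick(inputs, train_idx), pick(targets, train_idx),
            pick(inputs, val_idx), pick(targets, val_idx))


# Ten made-up records with two answers each
inputs = [[i / 10, 1 - i / 10] for i in range(10)]
targets = [[i / 10] for i in range(10)]
x_train, y_train, x_val, y_val = split_records(inputs, targets)
print(len(x_train), len(x_val))
```

The held-out part could then be passed to `model.fit` through its `validation_data` argument, so that the validation loss is reported after every epoch.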

Once the model has been trained on labeled data, it can be tested on the answers from new surveys, which must contain the same number of questions as the training data:

Finally, each result is converted to percent and represents the predicted severity of dementia.

In [None]:
test_data = []

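for f in sorted(os.listdir('data/test')):
    with open(os.path.join('data/test', f)) as fh:
        record = json.load(fh)
    # Assumption: a test file holds either a full record or a bare list of scores
    answers = record['ans'] if isinstance(record, dict) else record
    test_data.append([x / 100 for x in answers])

# Print each predicted progression in percent
for ls in model.predict(tf.constant(test_data, dtype=tf.float32)):
    print(ls[0] * 100)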
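
The conversion of a raw network output into a percentage can also be wrapped in a small helper. The function `to_percent` below is a hypothetical example, not part of the original notebook; it clamps the value to the valid range and rounds it for display.

```python
def to_percent(prediction, digits=1):
    """Convert a model output in [0, 1] to a percentage, clamped to 0-100."""
    clamped = min(max(float(prediction), 0.0), 1.0)
    return round(clamped * 100, digits)


# Made-up raw outputs, including two that fall outside the valid range
for raw in (0.35, 0.9999, 1.2, -0.1):
    print(to_percent(raw))
```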