# Density To RDF

In [None]:
include("src/DensityToRDF.jl")

using .DensityToRDF
using Flux
using Plots

To run this program, first configure the parameters and prepare the training data. Define an array of concentrations (in percent) and an array of paths to the corresponding RDF files. Each RDF file should contain two columns: distances in the first and the corresponding RDF values in the second.
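
To make the expected file layout concrete, here is a minimal sketch of reading one two-column RDF file. The helper `read_rdf_columns` is hypothetical and not part of `DensityToRDF`; the sketch also assumes whitespace-separated columns and `#` comment lines, which may not match your files.

```julia
using DelimitedFiles  # standard library; provides readdlm

# Hypothetical helper (not part of DensityToRDF): read a two-column RDF file
# and return the distance column and the value column as vectors.
# Assumes whitespace-separated columns and '#' comment lines.
function read_rdf_columns(path::AbstractString)
    table = readdlm(path, Float64; comments=true, comment_char='#')
    size(table, 2) == 2 || error("expected 2 columns in $path, got $(size(table, 2))")
    return table[:, 1], table[:, 2]
end

# Write a tiny example file so the sketch runs on its own.
example_path = joinpath(mktempdir(), "example.rdf")
write(example_path, "# r g(r)\n0.1 0.0\n0.2 0.5\n0.3 1.2\n")

r, g = read_rdf_columns(example_path)
println("points: ", length(r))
```

The package's own `prepare_training_data` does the actual loading in the cells below; this sketch only illustrates the column layout.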

Next, create a named tuple of training parameters with the fields `epochs` and `learning_rate`. The program uses the `Adam` optimizer by default, so you do not need to specify an optimizer.
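
As a small illustration, the parameters can be written as a Julia named tuple and checked before training. The `check_params` function is a hypothetical sanity check added here for illustration; it is not part of `DensityToRDF`.

```julia
# Training parameters as a named tuple; the field names are those given in the text.
params = (epochs=10_000, learning_rate=0.01)

# Hypothetical sanity check (not part of DensityToRDF): verify that both
# fields exist and hold positive values, then return the tuple unchanged.
function check_params(p)
    haskey(p, :epochs) || error("missing field: epochs")
    haskey(p, :learning_rate) || error("missing field: learning_rate")
    p.epochs > 0 || error("epochs must be positive")
    p.learning_rate > 0 || error("learning_rate must be positive")
    return p
end

check_params(params)
println("epochs = ", params.epochs, ", learning rate = ", params.learning_rate)
```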

The length of your RDF files matters. **All files must have the same number of data points**, because this number determines the structure of the neural network model. For example, if your RDF plots have 300 points, the neural network will have 300 neurons in its output layer, so the network's output matches the structure of the data you provide.
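
The length requirement can be checked up front. The sketch below uses a hypothetical helper, `common_rdf_length`, and random toy vectors in place of real RDF curves; the package's `validate_data` may perform a different or more thorough check.

```julia
# Hypothetical helper (not part of DensityToRDF): return the common length of
# all RDF curves, or throw an error if the curves differ in length.
function common_rdf_length(curves)
    isempty(curves) && error("no RDF curves given")
    lengths = unique(length.(curves))
    length(lengths) == 1 || error("RDF curves differ in length: $(lengths)")
    return first(lengths)
end

# Toy data standing in for the value columns of three RDF files.
curves = [rand(300), rand(300), rand(300)]

# This common length is the number of neurons the output layer needs.
n_points = common_rdf_length(curves)
println("output layer size: ", n_points)
```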

## Loading Training and Testing Data

In [None]:
# Concentrations (in percent) used for training, and their RDF files
concentrations = [100, 60, 40, 20]
rdf_paths = ["rdf_data/$(c)CH3OH-CG.rdf" for c in concentrations]

# Concentrations used for testing: 10, 20, ..., 100
test_concentrations = 10:10:100
test_rdf_paths = ["rdf_data/$(c)CH3OH-CG.rdf" for c in test_concentrations]

# Reference RDFs, used later when plotting the results
reference_data = load_reference_data(test_rdf_paths)

validate_data(concentrations, rdf_paths)

data = prepare_training_data(concentrations, rdf_paths)
test_data = prepare_training_data(test_concentrations, test_rdf_paths);

## Creating the Model

In [None]:
# The number of points in the first training RDF sets the size of the output layer
output_neurons = length(first(data)[2])
model = create_model(output_neurons)

## Training

In [None]:
params = (
    epochs=10000,
    learning_rate=0.01
);

In [None]:
trained_model, losses_initial = train_model(model, data, params);

In [None]:
plot_loss_values(losses_initial)

In [None]:
save_losses(losses_initial, "losses_initial.txt")

In [None]:
evaluation_results = evaluate_model(trained_model, test_data)
println("Evaluation results:")
for result in evaluation_results
    println("Concentration: $(result.input), MSE: $(result.mse)")
end

In [None]:
plot_evaluation_results(evaluation_results)

## Fine Tuning

A single training session is often not sufficient. You typically need to train the model further with a different set of parameters, and a smaller learning rate often helps.
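
The repeated training described above can also be written as a loop over learning rates. This is only a sketch: `fine_tune` is a hypothetical helper, and it assumes the training function is called as `train(model, data, params)` and returns a `(model, losses)` pair, which is how `train_model` is used in this notebook. A stand-in trainer is used so the sketch runs without the package.

```julia
# Hypothetical helper (not part of DensityToRDF): train repeatedly, once per
# learning rate, feeding each trained model into the next round.
# `train` must accept (model, data, params) and return (model, losses).
function fine_tune(train, model, data, learning_rates; epochs)
    all_losses = Float64[]
    for lr in learning_rates
        model, losses = train(model, data, (epochs=epochs, learning_rate=lr))
        append!(all_losses, losses)
    end
    return model, all_losses
end

# Stand-in trainer: it does no real training and only records the learning
# rate it was given, returning one fake loss value per epoch.
dummy_train(model, data, p) =
    (push!(copy(model), p.learning_rate), fill(p.learning_rate, p.epochs))

demo_model, demo_losses = fine_tune(dummy_train, Float64[], nothing,
                                    [1e-3, 1e-4, 1e-5]; epochs=2)
println("rounds run: ", length(demo_model))
```

With the package loaded, the same helper could presumably be called as `fine_tune(train_model, trained_model, data, [1e-3, 1e-4, 1e-5]; epochs=100_000)`, under the assumption about the return value stated above.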

**For best results, repeat the cells in this section several times, decreasing the learning rate each time.**

In [None]:
params = (
    epochs=100000,
    learning_rate=0.0001
);

In [None]:
trained_model, losses_fine_tuned = train_model(trained_model, data, params);

In [None]:
plot_loss_values(losses_fine_tuned, false)

In [None]:
save_losses(losses_fine_tuned, "losses_fine_tuned.txt")

In [None]:
evaluation_results_fine_tuned = evaluate_model(trained_model, test_data)
println("Evaluation results:")
for result in evaluation_results_fine_tuned
    println("Concentration: $(result.input), MSE: $(result.mse)")
end

In [None]:
plot_evaluation_results(evaluation_results_fine_tuned)

In [None]:
save_model(trained_model, "output_model.bson")

## Results

In [None]:
plot_results(test_concentrations, reference_data, trained_model)

## Predicting the RDF for Any Concentration

In [None]:
conc = 42
plot_rdf_model(trained_model([conc]), conc)

## Interpreting the Model

In [None]:
# First parameter array of the model: the weights
p = plot(Flux.params(trained_model)[1],
    xlabel="Number",
    ylabel="Values",
    title="WEIGHTS of Output Layer",
    linewidth=2,
    label="Neurons"
)
display(p)

In [None]:
# Second parameter array of the model: the biases
p = plot(Flux.params(trained_model)[2],
    xlabel="Number",
    ylabel="Values",
    title="BIASES of Output Layer",
    linewidth=2,
    label="Biases"
)
display(p)

In [None]:
# Prediction for zero concentration, for comparison with the bias plot above
conc = 0
plot_rdf_model(trained_model([conc]), conc)