# Example: randomized complete block design (RCBD) with Sudoku

First, import all required modules.

In [None]:
using SudokuPlantDesign
using DataFrames
using XLSX
using PyPlot

In contrast to the augmented design, a randomized complete block design (RCBD) contains only replicated varieties (previously called *checks*) and no unreplicated ones (previously called *entries*). To use the Sudoku optimizer for an RCBD, the entries of the RCBD are therefore placed on the check positions of the Sudoku design. These checks make up the entire field, arranged so that each block contains exactly one genotype of each type.
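
The block constraint described above can be illustrated with a small check in plain Julia. This is only a sketch and not part of `SudokuPlantDesign`: the helper `is_rcbd`, the integer genotype IDs, and the assumption that blocks are consecutive groups of rows are all hypothetical choices made for illustration.

```julia
# Hypothetical helper (not from SudokuPlantDesign): test whether every block
# of a layout contains each genotype ID 1..n_genotypes exactly once.
# Assumption: blocks are consecutive groups of `rows_per_block` rows.
function is_rcbd(layout::AbstractMatrix{<:Integer}, rows_per_block::Integer, n_genotypes::Integer)
    nrows = size(layout, 1)
    nrows % rows_per_block == 0 || return false
    for first_row in 1:rows_per_block:nrows
        block = layout[first_row:first_row + rows_per_block - 1, :]
        # A complete block is a permutation of all genotype IDs.
        sort(vec(block)) == collect(1:n_genotypes) || return false
    end
    return true
end

# Toy example: 2 blocks of 1 row x 3 columns with 3 genotypes.
valid   = [1 2 3; 3 1 2]
invalid = [1 2 3; 3 3 2]   # genotype 1 is missing from the second block

println(is_rcbd(valid, 1, 3))
println(is_rcbd(invalid, 1, 3))
```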

## 1) Generate (optimized) Sudoku configuration

We start by generating a new configuration `conf`, which is divided into `3` horizontal and `1` vertical block, each of dimensions `3` x `5`. The total design therefore contains 3 x 3 x 5 = 45 plots.

In our example, there are `15` different genotypes to replicate, which now enter the Sudoku configuration as checks. They are initialized randomly with one check per type per block. This already yields a promising starting configuration; however, further optimization is required to maximize the distance between plots of the same genotype.
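
To make the idea of "one check per type per block" concrete, the following sketch builds such a random starting layout in plain Julia. It is not the implementation behind `initialize_checks_per_block!`. The function `random_rcbd_layout` is a hypothetical helper, and it assumes that blocks are stacked along the first axis and that each block holds every genotype exactly once.

```julia
using Random

# Hypothetical helper (not from SudokuPlantDesign): fill each block with a
# random permutation of all genotype IDs, so every block is complete.
function random_rcbd_layout(n_blocks::Integer, rows_per_block::Integer, n_cols::Integer;
                            rng::AbstractRNG = Random.default_rng())
    n_genotypes = rows_per_block * n_cols          # one plot per genotype per block
    layout = zeros(Int, n_blocks * rows_per_block, n_cols)
    for b in 1:n_blocks
        rows = ((b - 1) * rows_per_block + 1):(b * rows_per_block)
        layout[rows, :] = reshape(shuffle(rng, 1:n_genotypes), rows_per_block, n_cols)
    end
    return layout
end

# 3 blocks of 3 x 5 plots, matching the dimensions used in this example.
layout = random_rcbd_layout(3, 3, 5; rng = MersenneTwister(1))
println(size(layout))     # (9, 5)
println(length(layout))   # 45
```

Such a layout satisfies the block constraint by construction, but identical genotypes may still end up next to each other across block boundaries, which is what the optimization addresses.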

In [None]:
conf = get_configuration([3,3,3],[5], 15)

initialize_checks_per_block!(conf)

show_configuration(conf, title_zoom=0.6)
mkpath("output/")
savefig("output/RCBD_configuration_initial.pdf")

The cost function and updates in this approach resemble those used in the augmented design example. Here, however, no cost term penalizes unequal numbers of checks per type (`K_num_checks_equal_per_type`). There are two reasons for this: first, the checks are initialized with equal numbers per type, and second, the only update used swaps checks and never assigns new labels, so these numbers do not change.
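
The effect of a distance-dependent penalty such as `d->1/(d^3)` can be seen in a small standalone sketch. It is not the package's `K_neighbors_same_check_functional`: the function `same_genotype_cost` is hypothetical, and it assumes Euclidean distances in plot units summed over all pairs of plots with the same genotype, which may differ from the neighborhood the package actually uses.

```julia
# Hypothetical helper (not from SudokuPlantDesign): sum a distance-dependent
# penalty over all pairs of plots that carry the same genotype.
function same_genotype_cost(layout::AbstractMatrix{<:Integer}, penalty::Function)
    positions = CartesianIndices(layout)
    total = 0.0
    for a in 1:length(layout), b in (a + 1):length(layout)
        layout[a] == layout[b] || continue
        pa, pb = positions[a], positions[b]
        d = sqrt((pa[1] - pb[1])^2 + (pa[2] - pb[2])^2)   # Euclidean distance (assumption)
        total += penalty(d)
    end
    return total
end

inverse_cube(d) = 1 / d^3

adjacent = [1 1 2 2]   # identical genotypes side by side (distance 1)
spread   = [1 2 1 2]   # identical genotypes separated by one plot (distance 2)

println(same_genotype_cost(adjacent, inverse_cube))   # 2.0
println(same_genotype_cost(spread, inverse_cube))     # 0.25
```

Doubling the distance reduces each pair's contribution by a factor of eight, so a penalty of this form is dominated by the closest pairs of identical genotypes.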

In [None]:
cost_function(c) =  K_checks_per_type_per_block(c, 1)*20 +
                    K_neighbors_different_check_functional(c, d->0.5/(d^3)) +
                    K_neighbors_same_check_functional(c, d->1/(d^3))

In [None]:
updates = [UpdateSwapCheckCheck()]

Then, the optimization is run,

In [None]:
optimize_design!(
    conf,
    updates,
    cost_function,
    500000
);

and the resulting configuration is visualized.

In [None]:
show_configuration(conf, title_zoom=0.6)
mkpath("output/")
savefig("output/RCBD_configuration_final.pdf")

## 2) Save design data with field plan

With an optimized configuration `conf` at hand, one can proceed to create a field plan for the design. For such a field plan, additional information on the genotypes involved in the trial is added. Here, this information enters as two dataframes with data for checks and entries (of the Sudoku design), respectively. As in the augmented design example, they must have the following structure:
- first column: genotype name
- further columns: additional information (optional, but must be identical across the dataframes)

Here, we have *only* checks in the Sudoku design (which are the replicated entries of the RCBD), meaning we can leave one of the two dataframes empty. For the check data, the sheet `genotypes` from the Excel file `input_RCBD.xlsx` is read in and converted into a dataframe.

In [None]:
entrydata = DataFrame();

In [None]:
checkdata = string.(DataFrame(XLSX.readtable("input_RCBD.xlsx", "genotypes")));
replace!.(eachcol(checkdata), "missing" => "NA");

The field plan is then based on an upgraded version of the configuration `conf`, a so-called *labeled check configuration* with the name `lconf`. This labeled configuration contains not only the original configuration, but also indices (position ID) and labels of each plant.

Below, indices are set in a snake pattern along the x-direction (changing direction as it traverses the block boundaries) and labels are filled from the previously created dataframes. Then, the labeled configuration is visualized.
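
A snake (serpentine) numbering can be sketched in plain Julia as follows. This is an illustration of the pattern only, not the implementation of `fill_indices_snake_x!`: the function `snake_indices` is hypothetical, and it assumes that the traversal runs along the second matrix axis and reverses on every row.

```julia
# Hypothetical helper (not from SudokuPlantDesign): number the plots of an
# nrows x ncols grid so that odd rows run left to right and even rows run
# right to left.
function snake_indices(nrows::Integer, ncols::Integer; start_index::Integer = 1)
    idx = zeros(Int, nrows, ncols)
    k = start_index
    for r in 1:nrows
        cols = isodd(r) ? (1:ncols) : (ncols:-1:1)
        for c in cols
            idx[r, c] = k
            k += 1
        end
    end
    return idx
end

block1 = snake_indices(3, 5)
block2 = snake_indices(3, 5; start_index = 16)
display(block1)
```

With 15 plots per block, consecutive blocks start at indices 1, 16, and 31, which matches the `start_index` values passed in the cell below.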

In [None]:
lconf = LabeledCheckConfiguration(conf)

fill_indices_snake_x!(lconf, 1,1, index_for_empty=false, max_i=3)
fill_indices_snake_x!(lconf, 1,-1, index_for_empty=false, min_i=4, max_i=6, start_index=16)
fill_indices_snake_x!(lconf, 1,1, index_for_empty=false, min_i=7, start_index=31)
fill_labels!(lconf, checkdata, entrydata)

show_configuration(lconf, check_labels=false, show_coordinates=true, text_zoom=0.8, title_zoom=0.5)
mkpath("output/")
savefig("output/RCBD_final_design.pdf")

For export into a trial plan, the data of this optimized Sudoku design can now be converted back into a dataframe. This dataframe contains not only the genotype names and properties, but also their individual positions, their xy-locations, and information about their block. It can be modified further in Julia before it is exported.

In this example, all generic property columns are renamed to the column names of the `checkdata` dataframe, and two additional columns are added to the dataframe.

In [None]:
df = get_dataframe(lconf)

for (i,name) in enumerate(names(checkdata)[2:end])
    rename!(df,Symbol("property_"*string(i)) => Symbol(name))
end

df[:, :year]       .= 2023
df[:, :extra_info] .= "myextrainfo"

Finally, the trial plan is created by writing the dataframe into an Excel file.

In [None]:
mkpath("output/")
XLSX.writetable("output/RCBD_final_design.xlsx", collect(eachcol(df)), names(df), overwrite=true)