# EP1 - Mandelbrot Set and Parallelization with Pthreads and OpenMP

| Name | NUSP |
|------|------|
| Giulia C. de Nardi | 10692203 |
| Vitor D. Tamae | 10705620 |
| Lucy Anne de Omena Evangelista | 11221776 |
| Leonardo Costa Santos | 10783142 |
| Alexandre Muller Jones | 8038149 |


If you do not want to rerun the experiments, skip to the "Comparative plots" section.

## Environment setup

Updating the Julia packages:

In [13]:
] up

[32m[1m  Updating[22m[39m registry at `~/.julia/registries/General`
[32m[1m  Updating[22m[39m git-repo `https://github.com/JuliaRegistries/General.git`
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.3/Project.toml`
[90m [no changes][39m
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.3/Manifest.toml`
[90m [no changes][39m


Checking the status of the packages, and whether there are any problems, with the command:

In [14]:
] st

[32m[1m    Status[22m[39m `~/.julia/environments/v1.3/Project.toml`
 [90m [336ed68f][39m[37m CSV v0.6.2[39m
 [90m [a93c6f00][39m[37m DataFrames v0.21.0[39m
 [90m [31c24e10][39m[37m Distributions v0.23.2[39m
 [90m [7073ff75][39m[37m IJulia v1.21.2[39m
 [90m [b964fa9f][39m[37m LaTeXStrings v1.1.0[39m
 [90m [8314cec4][39m[37m PGFPlotsX v1.2.6[39m
 [90m [1a8c2f83][39m[37m Query v0.12.2[39m
 [90m [f3b207a7][39m[37m StatsPlots v0.14.6[39m


Loading the packages we will use:

In [15]:
using DataFrames, Query, StatsPlots, Statistics

## Functions for running the experiments

In [None]:
; make mandelbrot_seq

In [None]:
; ./mandelbrot_seq

In [None]:
; ./mandelbrot_seq 0.175 0.375 -0.1 0.1 200 0

The function below takes the parameters `size`, the image size; `f`, the id of the region to render (0 - Full, 1 - Triple Spiral, 2 - Elephant, 3 - Seahorse); `mandel`, the name of the executable to run (`mandelbrot_seq`, `mandelbrot_omp`, `mandelbrot_pth`); and the keyword `thread`, the number of threads for the parallel programs (the default, 0, omits the argument). The function runs the chosen `mandelbrot` program with these parameters and returns a `DataFrame` with the results.

In [16]:
function measure_mandelbrot(size, f, mandel; thread = 0)
    # Region bounds passed to the program on the command line
    if f == 0  mode = `-2.5 1.5 -2.0 2.0` # full
    elseif f == 1  mode = `-0.188 -0.012 0.554 0.754` # triple spiral
    elseif f == 2  mode = `0.175 0.375 -0.1 0.1` # elephant
    elseif f == 3  mode = `-0.8 -0.7 0.05 0.15` # seahorse
    else  error("unknown region id f = $f")
    end

    # The thread count is passed only when thread != 0 (parallel versions)
    cmd = thread != 0 ? `./$mandel $mode $size $thread` : `./$mandel $mode $size`

    # The program output is parsed as comma-separated numbers
    results = parse.(Float64, split(chomp(read(cmd, String)), ","))

    return DataFrame(size = size,
        f = f,
        threads = thread,
        duration = results[1],
        io_alocation = results[2])
end

measure_mandelbrot (generic function with 1 method)
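
The parsing step inside `measure_mandelbrot` can be tried in isolation. The sketch below assumes the executable prints one line with two comma-separated numbers; the string `raw_output` is a made-up stand-in for that output, not a real measurement.

```julia
# Hypothetical program output; real values come from ./mandelbrot_seq and friends.
raw_output = "0.0123,0.0456\n"

# Same steps as in measure_mandelbrot:
# strip the trailing newline, split on commas, convert each field to Float64.
fields = parse.(Float64, split(chomp(raw_output), ","))

# The first field is stored as `duration`, the second as `io_alocation`.
duration, io_alocation = fields
println("duration = ", duration, ", io_alocation = ", io_alocation)
```

If the program printed anything else on that line (for example a label or a unit), `parse` would throw an `ArgumentError`, so the executables need to emit bare numbers.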

The function `run_experiments` takes the same `size`, `f`, and `mandel` parameters, the keyword `threads` (a list of thread counts to test), and an additional parameter `repetitions`, the number of repetitions of each experiment for a given number of threads. It builds the executable with `make` and returns a `DataFrame` with all experiments.

In [5]:
function run_experiments(size, f, mandel, repetitions; threads = [])
    run(`make $mandel`)

    results = DataFrame(size = Int[],
        f = Int[],
        threads = Int[],
        duration = Float64[],
        io_alocation = Float64[])

    if !isempty(threads)
        # Parallel versions: repeat every (threads, size) combination
        for t in threads
            for s in size
                for r in 1:repetitions
                    append!(results,
                        measure_mandelbrot(s, f, mandel, thread = t))
                end
            end
        end
    else
        # Sequential version: no thread argument
        for r in 1:repetitions
            for s in size
                append!(results,
                    measure_mandelbrot(s, f, mandel))
            end
        end
    end

    return results
end

run_experiments (generic function with 1 method)

The function `parse_results` takes a `DataFrame` of results produced by `run_experiments`. It returns a `DataFrame` with the mean of the execution times and the 95% interval, computed as 1.96 times the sample standard deviation, grouped by number of threads and image size.

In [17]:
function parse_results(results)
    parsed_results = results |>
                    @groupby({_.threads,_.size}) |>
                    @map({threads = key(_).threads,
                          size = _.size[1],
                          mean_duration = mean(_.duration),
                          mean_io_alocation = mean(_.io_alocation),
                          ci_duration = 1.96 * std(_.duration),
                          ci_io_alocation = 1.96 * std(_.io_alocation)}) |>
                    DataFrame
    
    return parsed_results
end

parse_results (generic function with 1 method)
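
`parse_results` reports `1.96 * std(...)`, which describes the spread of individual runs. A 95% confidence interval for the mean, under a normal approximation, would instead use the standard error `std / sqrt(n)`. The sketch below compares the two on made-up durations; `ci95_mean` is a hypothetical helper and is not part of the notebook.

```julia
using Statistics

# Half-width of an approximate 95% confidence interval for the mean,
# using the normal quantile 1.96 and the standard error std / sqrt(n).
ci95_mean(xs) = 1.96 * std(xs) / sqrt(length(xs))

# Made-up durations (seconds) standing in for 10 repetitions of one experiment.
durations = [0.321, 0.325, 0.319, 0.330, 0.322, 0.318, 0.327, 0.324, 0.320, 0.323]

spread = 1.96 * std(durations)      # what parse_results computes
half_width = ci95_mean(durations)   # narrower by a factor of sqrt(10)

println("mean = ", mean(durations))
println("1.96 * std = ", spread)
println("95% CI half-width of the mean = ", half_width)
```

With only 10 repetitions, a Student's t quantile (about 2.26 for 9 degrees of freedom) would give a slightly wider interval than 1.96.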

## Functions for plotting

The function below plots up to 5 data series in a single scatter plot.

In [18]:
pgfplotsx()

function plot_results(x, y, series_label, yerror; y2 = [], series_label2 = [], yerror2 = [], 
        y3 = [], series_label3 = [], yerror3 = [], y4 = [], series_label4 = [], yerror4 = [],
        y5 = [], series_label5 = [], yerror5 = [])
    max_thread_power = 5
    
    p = scatter(x, y, xaxis = :log2, xlabel = "Threads", xticks = [2 ^ x for x in 0:max_thread_power],
        yerror = yerror, alpha = 0.6, 
        labels = series_label, legend = :bottomright)
    
    if y2 != []
        p = scatter!(x, y2, xaxis = :log2, xticks = [2 ^ x for x in 0:max_thread_power],
            yerror = yerror2, alpha = 0.6,
            labels = series_label2, legend = :bottomright)
    end
    if y3 != []
        p = scatter!(x, y3, xaxis = :log2, xticks = [2 ^ x for x in 0:max_thread_power],
            yerror = yerror3, alpha = 0.6,
            labels = series_label3, legend = :bottomright)
    end
    if y4 != []
        p = scatter!(x, y4, xaxis = :log2, xticks = [2 ^ x for x in 0:max_thread_power],
            yerror = yerror4, alpha = 0.6,
            labels = series_label4, legend = :bottomright)
    end
    if y5 != []
        p = scatter!(x, y5, xaxis = :log2, xticks = [2 ^ x for x in 0:max_thread_power],
            yerror = yerror5, alpha = 0.6,
            labels = series_label5, legend = :bottomright)
    end
    
    return p
end

plot_results (generic function with 1 method)

## Experiment conditions

In [23]:
size = [2 ^ x for x in 4:13]
thread = [2 ^ x for x in 0:5]
repetitions = 10;
# Smaller configuration for quick test runs:
#size = [2 ^ x for x in 4:7]
#thread = [2 ^ x for x in 0:3]
#repetitions = 2;
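
To get a feel for the cost of the full configuration, the number of program executions per region can be counted directly. This is a minimal sketch that repeats the grid from the cell above under different names (`sizes`, `thread_counts`, both hypothetical) to avoid shadowing `Base.size`.

```julia
# Same grid as the experiment conditions, under illustrative names.
sizes = [2 ^ x for x in 4:13]         # 16, 32, ..., 8192
thread_counts = [2 ^ x for x in 0:5]  # 1, 2, ..., 32
repetitions = 10

# run_experiments calls the program once per (threads, size, repetition)
# for the parallel versions, and once per (size, repetition) otherwise.
runs_parallel = length(thread_counts) * length(sizes) * repetitions
runs_sequential = length(sizes) * repetitions

println("runs per region (parallel):   ", runs_parallel)
println("runs per region (sequential): ", runs_sequential)
```

Each of the four regions is measured with this grid, so the largest sizes dominate the total running time.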

# Generating and saving data

## Sequential Mandelbrot

In this part, we run the sequential version of Mandelbrot and analyze its execution time for the different regions (Triple Spiral, Elephant, Seahorse, and Full) at different resolutions ($2^4, \ldots, 2^{13}$).

Running the measurements for the sequential Mandelbrot:

In [30]:
results_seq_full = run_experiments(size, 0, "mandelbrot_seq", repetitions)
seq_full = parse_results(results_seq_full);

cc     mandelbrot_seq.c   -o mandelbrot_seq


ArgumentError: ArgumentError: Column names :io_alocation were found in only one of the passed data frames and `cols == :setequal`

In [34]:
seq_full

UndefVarError: UndefVarError: seq_full not defined

In [31]:
results_seq_triplespiral = run_experiments(size, 1, "mandelbrot_seq", repetitions)
seq_triplespiral = parse_results(results_seq_triplespiral);

make: 'mandelbrot_seq' is up to date.


ArgumentError: ArgumentError: Column names :io_alocation were found in only one of the passed data frames and `cols == :setequal`

In [32]:
results_seq_elephant = run_experiments(size, 2, "mandelbrot_seq", repetitions)
seq_elephant = parse_results(results_seq_elephant);

make: 'mandelbrot_seq' is up to date.


ArgumentError: ArgumentError: Column names :io_alocation were found in only one of the passed data frames and `cols == :setequal`

In [33]:
results_seq_seahorse = run_experiments(size, 3, "mandelbrot_seq", repetitions)
seq_seahorse = parse_results(results_seq_seahorse);

make: 'mandelbrot_seq' is up to date.


ArgumentError: ArgumentError: Column names :io_alocation were found in only one of the passed data frames and `cols == :setequal`

In [None]:
showall(seq_seahorse)

## Mandelbrot with Pthreads

In [24]:
results_pth_full = run_experiments(size, 0, "mandelbrot_pth", repetitions,threads=thread)
pth_full = parse_results(results_pth_full);

make: 'mandelbrot_pth' is up to date.


In [25]:
pth_full

| Row | threads (Int64) | size (Int64) | mean_duration (Float64) | mean_io_alocation (Float64) | ci_duration (Float64) | ci_io_alocation (Float64) |
|-----|-----------------|--------------|-------------------------|-----------------------------|-----------------------|---------------------------|
| 1 | 1 | 16 | 0.0002328 | 0.0003282 | 4.98489e-5 | 4.84684e-5 |
| 2 | 1 | 32 | 0.000477 | 0.0006425 | 7.03843e-5 | 9.10844e-5 |
| 3 | 1 | 64 | 0.00152 | 0.001843 | 0.000203408 | 0.000230404 |
| 4 | 1 | 128 | 0.0056281 | 0.0068485 | 0.000907424 | 0.00128988 |
| 5 | 1 | 256 | 0.0208044 | 0.0247506 | 0.000763439 | 0.00082257 |
| 6 | 1 | 512 | 0.0817333 | 0.0972632 | 0.00501587 | 0.00515828 |
| 7 | 1 | 1024 | 0.321744 | 0.386844 | 0.00513979 | 0.0124842 |
| 8 | 1 | 2048 | 1.28677 | 1.54528 | 0.0228244 | 0.0349549 |
| 9 | 1 | 4096 | 5.13044 | 6.15101 | 0.0608508 | 0.0786069 |
| 10 | 1 | 8192 | 19.5874 | 23.3887 | 3.43214 | 3.49143 |


In [26]:
results_pth_triplespiral = run_experiments(size, 1, "mandelbrot_pth", repetitions,threads=thread)
pth_triplespiral = parse_results(results_pth_triplespiral);

make: 'mandelbrot_pth' is up to date.


In [45]:
showall(results_pth_triplespiral)

In [27]:
results_pth_elephant = run_experiments(size, 2, "mandelbrot_pth", repetitions,threads=thread)
pth_elephant = parse_results(results_pth_elephant);

make: 'mandelbrot_pth' is up to date.


In [28]:
results_pth_seahorse = run_experiments(size, 3, "mandelbrot_pth", repetitions,threads=thread)
pth_seahorse = parse_results(results_pth_seahorse);

make: 'mandelbrot_pth' is up to date.


ProcessFailedException: failed process: Process(`./mandelbrot_pth -0.8 -0.7 0.05 0.15 8192 1`, ProcessSignaled(2)) [0]


## Mandelbrot with OpenMP

In [36]:
results_omp_full = run_experiments(size, 0, "mandelbrot_omp", repetitions,threads=thread)
omp_full = parse_results(results_omp_full);

make: 'mandelbrot_omp' is up to date.


ProcessFailedException: failed process: Process(`./mandelbrot_omp -2.5 1.5 -2.0 2.0 1024 1`, ProcessSignaled(2)) [0]


In [37]:
results_omp_triplespiral = run_experiments(size, 1, "mandelbrot_omp", repetitions,threads=thread)
omp_triplespiral = parse_results(results_omp_triplespiral);

make: 'mandelbrot_omp' is up to date.


InterruptException: InterruptException:

In [38]:
results_omp_elephant = run_experiments(size, 2, "mandelbrot_omp", repetitions,threads=thread)
omp_elephant = parse_results(results_omp_elephant);

make: 'mandelbrot_omp' is up to date.


InterruptException: InterruptException:

In [39]:
results_omp_seahorse = run_experiments(size, 3, "mandelbrot_omp", repetitions,threads=thread)
omp_seahorse = parse_results(results_omp_seahorse);

make: 'mandelbrot_omp' is up to date.


InterruptException: InterruptException:

In [40]:
showall(omp_seahorse)

UndefVarError: UndefVarError: omp_seahorse not defined

## Saving data

In [22]:
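using CSV

# Write a DataFrame to `filename`, printing the path as a progress message
function save_csv_results(results, filename)
    println(filename)
    CSV.write(filename, results)
end

# Read a previously saved CSV file back into a DataFrame
function read_csv_results(filename)
    results = CSV.read(filename)
    return results
end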


read_csv_results (generic function with 1 method)

In [35]:
save_csv_results(results_seq_full,"data/results_seq_full.csv")
save_csv_results(results_seq_triplespiral,"data/results_seq_triplespiral.csv")
save_csv_results(results_seq_elephant,"data/results_seq_elephant.csv")
save_csv_results(results_seq_seahorse,"data/results_seq_seahorse.csv");
save_csv_results(seq_full,"data/seq_full.csv")
save_csv_results(seq_triplespiral,"data/seq_triplespiral.csv")
save_csv_results(seq_elephant,"data/seq_elephant.csv")
save_csv_results(seq_seahorse,"data/seq_seahorse.csv");

UndefVarError: UndefVarError: results_seq_full not defined

In [41]:
save_csv_results(results_omp_full,"data/results_omp_full.csv")
save_csv_results(results_omp_triplespiral,"data/results_omp_triplespiral.csv")
save_csv_results(results_omp_elephant,"data/results_omp_elephant.csv")
save_csv_results(results_omp_seahorse,"data/results_omp_seahorse.csv");
save_csv_results(omp_full,"data/omp_full.csv")
save_csv_results(omp_triplespiral,"data/omp_triplespiral.csv")
save_csv_results(omp_elephant,"data/omp_elephant.csv")
save_csv_results(omp_seahorse,"data/omp_seahorse.csv");

UndefVarError: UndefVarError: results_omp_full not defined

In [42]:
save_csv_results(results_pth_full,"data/results_pth_full.csv")
save_csv_results(results_pth_triplespiral,"data/results_pth_triplespiral.csv")
save_csv_results(results_pth_elephant,"data/results_pth_elephant.csv")
save_csv_results(results_pth_seahorse,"data/results_pth_seahorse.csv")
save_csv_results(pth_full,"data/pth_full.csv")
save_csv_results(pth_triplespiral,"data/pth_triplespiral.csv")
save_csv_results(pth_elephant,"data/pth_elephant.csv")
save_csv_results(pth_seahorse,"data/pth_seahorse.csv");

data/results_pth_full.csv
data/results_pth_triplespiral.csv
data/results_pth_elephant.csv


UndefVarError: UndefVarError: results_pth_seahorse not defined

# Comparative plots

At the end, we will have the following DataFrames:

|DataFrame | Full | Triple Spiral | Elephant | Seahorse |
|----------|--------|--------|--------|--------|
|Sequential|seq_full|seq_triplespiral|seq_elephant|seq_seahorse|
|PThreads|pth_full|pth_triplespiral|pth_elephant|pth_seahorse|
|OpenMP|omp_full|omp_triplespiral|omp_elephant|omp_seahorse|

Loading the generated DataFrames, for future tests:

In [None]:
seq_full=read_csv_results("data/seq_full.csv")
seq_triplespiral=read_csv_results("data/seq_triplespiral.csv")
seq_elephant=read_csv_results("data/seq_elephant.csv")
seq_seahorse=read_csv_results("data/seq_seahorse.csv")
omp_full=read_csv_results("data/omp_full.csv")
omp_triplespiral=read_csv_results("data/omp_triplespiral.csv")
omp_elephant=read_csv_results("data/omp_elephant.csv")
omp_seahorse=read_csv_results("data/omp_seahorse.csv")
pth_full=read_csv_results("data/pth_full.csv")
pth_triplespiral=read_csv_results("data/pth_triplespiral.csv")
pth_elephant=read_csv_results("data/pth_elephant.csv")
pth_seahorse=read_csv_results("data/pth_seahorse.csv");

In [None]:
results_seq_full=read_csv_results("data/results_seq_full.csv")
results_seq_triplespiral=read_csv_results("data/results_seq_triplespiral.csv")
results_seq_elephant=read_csv_results("data/results_seq_elephant.csv")
results_seq_seahorse=read_csv_results("data/results_seq_seahorse.csv")
results_omp_full=read_csv_results("data/results_omp_full.csv")
results_omp_triplespiral=read_csv_results("data/results_omp_triplespiral.csv")
results_omp_elephant=read_csv_results("data/results_omp_elephant.csv")
results_omp_seahorse=read_csv_results("data/results_omp_seahorse.csv")
results_pth_full=read_csv_results("data/results_pth_full.csv")
results_pth_triplespiral=read_csv_results("data/results_pth_triplespiral.csv")
results_pth_elephant=read_csv_results("data/results_pth_elephant.csv")
results_pth_seahorse=read_csv_results("data/results_pth_seahorse.csv");

We will build the plots from partitions of the DataFrame, as shown below:

In [None]:
filter(row -> row[:threads] == 1, omp_full)

In [None]:
filter(row -> row[:size] == 16, omp_full)

In [None]:
filter(row -> row[:size] == 16, omp_full).mean_duration

Ideas for the plots:
> Compare performance by image size (5 plots for the image sizes; each series in a plot should be one way of generating the image)

> Compare performance by type of image produced (3 x 4 plots: 4 regions with 3 kinds of computation each; also cross with the input size? That would be 4 x 3 x 10)

> I am confused; help me figure out which plots to make

### Comparing performance by region

#### Sequential

In [None]:
plot_results(seq_full.threads, seq_full.mean_duration, "Full", seq_full.ci_duration,
    y2 = seq_seahorse.mean_duration, series_label2 = "Seahorse", yerror2 = seq_seahorse.ci_duration,
    y3 = seq_elephant.mean_duration, series_label3 = "Elephant", yerror3 = seq_elephant.ci_duration,
    y4 = seq_triplespiral.mean_duration, series_label4 = "Triple Spiral", yerror4 = seq_triplespiral.ci_duration)

#### OpenMP

In [None]:
plot_results( omp_full.threads, omp_full.mean_duration, "Full", omp_full.ci_duration,
    y2 = omp_seahorse.mean_duration, series_label2 = "Seahorse", yerror2 = omp_seahorse.ci_duration,
    y3 = omp_elephant.mean_duration, series_label3 = "Elephant", yerror3 = omp_elephant.ci_duration,
    y4 = omp_triplespiral.mean_duration, series_label4 = "Triple Spiral", yerror4 = omp_triplespiral.ci_duration)

#### PThreads

In [None]:
plot_results(pth_full.threads, pth_full.mean_duration, "Full", pth_full.ci_duration,
    y2 = pth_seahorse.mean_duration, series_label2 = "Seahorse", yerror2 = pth_seahorse.ci_duration,
    y3 = pth_elephant.mean_duration, series_label3 = "Elephant", yerror3 = pth_elephant.ci_duration,
    y4 = pth_triplespiral.mean_duration, series_label4 = "Triple Spiral", yerror4 = pth_triplespiral.ci_duration)

In [None]:
plot_results(
    filter(row -> row[:size] == 16, omp_full).threads,
    filter(row -> row[:size] == 16, omp_full).mean_duration, "Full", 
    filter(row -> row[:size] == 16, omp_full).ci_duration,
    y2 = filter(row -> row[:size] == 32, omp_full).mean_duration,
    series_label2 = "32", yerror2 = filter(row -> row[:size] == 32, omp_full).ci_duration)

### Comparing performance by number of threads

In [None]:
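# Plot mean duration by thread count for a single parsed-results DataFrame.
# Note: despite its name, `filename` is a DataFrame, not a path.
function fazgrafico(filename, label)
    plot_results(filename.threads, filename.mean_duration, label, filename.ci_duration)
end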

In [None]:
fazgrafico(pth_full, "pth_full")

In [None]:
fazgrafico(seq_full, "seq_full") #, filename2 = seq_seahorse, label2 = "seq_seahorse")

In [None]:
seq_seahorse

In [None]:
fazgrafico(omp_full, "omp_full")


In [None]:
fazgrafico(omp_triplespiral, "omp_triplespiral")


In [None]:
fazgrafico(omp_elephant, "omp_elephant")


In [None]:
fazgrafico(omp_seahorse, "omp_seahorse")