# Importing ODV aggregated spreadsheet data

* The data are assumed to have been aggregated with ODV and exported as an aggregated ODV spreadsheet file.
* Replace the file name `small_ODV_sample.txt` with the name of your own aggregated ODV file.
* Do not export the "data error" from ODV (column header `STANDARD_DEV`).

In [None]:
using DIVAnd
using PyPlot
if VERSION >= v"0.7.0-beta.0"
    using Dates
    using Statistics
    using DelimitedFiles
else
    using Compat: @info, @warn, @debug
end
using Compat

In [None]:
#download("...","data/small_ODV_sample.txt")

Aggregated ODV files do not have a semantic header. The column therefore has to be selected by its "local" column header name, whereas the P01 name can be used for ODV files conforming to the
[Specification of SeaDataNet Data Transport Formats](https://www.seadatanet.org/content/download/636/3333/file/SDN2_D85_WP8_Datafile_formats.pdf?version=2).


By default only `good` and `probably good` values are loaded.     
This can be changed using the optional parameter `qv_flags`:

In [None]:
DIVAnd.ODVspreadsheet.GOOD_VALUE
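
As a rough illustration of what selecting by quality flag means, the sketch below filters a few made-up values by their flag. It is not the code used by `ODVspreadsheet.load`: the arrays `vals` and `flags` and the two accepted lists are hypothetical, and the flag codes assume the usual SeaDataNet convention (`"1"` good, `"2"` probably good, `"3"` probably bad, `"4"` bad).

```julia
# Hypothetical values with their quality flags (assumed SeaDataNet codes:
# "1" = good, "2" = probably good, "3" = probably bad, "4" = bad).
vals  = [35.1, 34.8, 12.0, 35.3, 99.9]
flags = ["1", "2", "4", "1", "3"]

# Flags accepted by default: good and probably good.
accepted_default = ["1", "2"]
keep_default = [f in accepted_default for f in flags]

# A stricter choice: keep only the values flagged as good.
accepted_strict = ["1"]
keep_strict = [f in accepted_strict for f in flags]

@show vals[keep_default]
@show vals[keep_strict]
```

In the notebook itself, the equivalent choice is made by passing a list of flags through the `qv_flags` parameter.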

In [None]:
?ODVspreadsheet.load

If, for some reason, the column name contains underscores (`Water_body_phosphate` as opposed to `Water body phosphate`), then the local name should also use underscores.

In [None]:
dataname = "Water body salinity"

obsval,obslon,obslat,obsdepth,obstime,obsid = ODVspreadsheet.load(Float64,["data/small_ODV_sample.txt"],
                           [dataname]; nametype = :localname );

Basic range check of the data, which also reports the presence of NaN and Inf values:

In [None]:
checkobs((obslon,obslat,obsdepth,obstime),obsval,obsid)
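
The kind of check described above can be sketched in plain Julia. This is only an illustration on a made-up vector `vals`, not the implementation of `checkobs`:

```julia
# Hypothetical observations containing one NaN and one Inf.
vals = [35.2, NaN, 34.9, Inf, 36.0]

# Count the problematic entries.
n_nan = count(isnan, vals)
n_inf = count(isinf, vals)

# Range of the finite values only.
finite_vals = filter(isfinite, vals)
vmin, vmax = extrema(finite_vals)

@show n_nan n_inf vmin vmax
```

Computing the range on the finite values only keeps a single NaN or Inf from hiding the actual minimum and maximum.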

Individual elements can be retrieved by indexing
`obsval`, `obslon`, `obslat`, `obsdepth`, `obstime` and `obsid`,
for example:

In [None]:
obsval[10]

## Remove data from the file
Generate a text file to keep track of the removed data.        
Define the indices of the data to delete:

In [None]:
index = [10,14]

Create an array containing these data:

In [None]:
baddata = ["lon" "lat" "depth" "time" "value" "ids";
    obslon[index]  obslat[index] obsdepth[index] obstime[index] obsval[index] obsid[index]]

The array will be written to a text file using the function `writedlm`.

In [None]:
?writedlm
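
A minimal, self-contained sketch of this step is shown below. The values, the column selection and the temporary file are made up for the illustration; only `writedlm` and `readdlm` from the `DelimitedFiles` standard library are used.

```julia
using DelimitedFiles
using Dates

# Hypothetical observations flagged for removal.
lon = [12.5, 13.0]
lat = [43.1, 44.2]
t   = [DateTime(2001, 10, 3), DateTime(2002, 4, 17)]
val = [31.2, 30.8]

# One header row followed by one row per observation.
# Mixing strings, numbers and dates gives a Matrix{Any}.
table = ["lon" "lat" "time" "value";
         lon   lat   t      val]

# Write to a temporary file (tab-separated by default) and read it back.
fname = tempname()
writedlm(fname, table)
readback = readdlm(fname, '\t')

@show size(readback)
```

When the file is read back, the dates come back as plain strings, so they have to be parsed again (for example with `DateTime`) if they are needed as dates.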

In [None]:
sel = trues(size(obslon))
sel[index] .= false

obslon_only_good_data = obslon[sel];
obslat_only_good_data = obslat[sel];
obsdepth_only_good_data = obsdepth[sel];
obstime_only_good_data = obstime[sel];
obsdata_only_good_data = obsval[sel];
obsids_only_good_data = obsid[sel];

@show size(obslon_only_good_data);
@show size(obslon);
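
The masking pattern used in this cell can be tried on a small, made-up vector. Here `vals` and `index` are hypothetical; the point is that a list of indices is turned into a boolean mask, and that the mask and its negation split the data into kept and removed observations.

```julia
# Hypothetical observation values.
vals = [35.1, 34.8, 12.0, 35.3, 99.9, 35.0]

# Indices of the observations to discard.
index = [3, 5]

# Start with a mask that keeps everything, then switch off the bad entries.
sel = trues(size(vals))
sel[index] .= false

good = vals[sel]      # observations that are kept
bad  = vals[.!sel]    # observations that are removed

# Every observation ends up in exactly one of the two groups.
@show length(good) + length(bad) == length(vals)
```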

In [None]:
writedlm("data/my_bad_data.txt",baddata)

The identifier (`obsid`) is a combination of the EDMO code and the LOCAL CDI ID.

In [None]:
;cat data/my_bad_data.txt

In [None]:
obsid[200]

In [None]:
SDNObsMetadata(obsid[10])
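
To take such an identifier apart by hand, one possible sketch is given below. It assumes that the EDMO code and the LOCAL CDI ID are joined by a hyphen, which should be checked against the actual value of `obsid[200]`. The identifier used here is invented.

```julia
# Hypothetical identifier: EDMO code, hyphen, LOCAL CDI ID
# (the hyphen separator is an assumption).
id = "1234-cruise-A_station_0042"

# Split on the first hyphen only, since the LOCAL CDI ID may itself
# contain hyphens.
edmo_str, local_cdi_id = split(id, '-'; limit = 2)
edmo = parse(Int, edmo_str)

@show edmo
@show local_cdi_id
```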

## Select data according to criterion
For the purpose of the example, let's assume we want to keep only the salinity values above 32 (even if the discarded observations are good).

In [None]:
sel = obsval .> 32.;

index = findall(.!sel)
@info("Number of removed observations: $(length(index))");

obsval = obsval[sel]
obslon = obslon[sel]
obslat = obslat[sel]
obsdepth = obsdepth[sel]
obstime = obstime[sel]
obsid = obsid[sel];
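
Since the same mask has to be applied to every array, the six assignments can also be written in one pass. The sketch below shows the idea with made-up arrays (`sal`, `lon`, `lat` and `ids` are hypothetical names); it is an alternative formulation, not what the notebook does.

```julia
# Hypothetical observations: salinity, longitude, latitude and identifier.
sal = [35.1, 31.5, 36.2, 30.0]
lon = [10.0, 11.0, 12.0, 13.0]
lat = [40.0, 41.0, 42.0, 43.0]
ids = ["a", "b", "c", "d"]

# Criterion: keep salinity strictly above 32.
sel = sal .> 32

# Apply the same mask to every array in one pass.
sal_sel, lon_sel, lat_sel, ids_sel = map(v -> v[sel], (sal, lon, lat, ids))

@show sal_sel
@show ids_sel
```

Writing the results to new names, as here, leaves the original arrays intact, which is convenient when several criteria are tried in a row.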

In [None]:
checkobs((obslon,obslat,obsdepth,obstime),obsval,obsid)

Here the criterion combines the depth (less than 50) and the month of measurement (October):

In [None]:
sel = (obsdepth .< 50.) .& (Dates.month.(obstime) .== 10)
@show sum(sel);
@show length(obsval);
obsval_new = obsval[sel];
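
The same approach extends to several months at once, for instance a whole season. The following sketch uses made-up depths and dates, and the choice of December to February as winter is only an example.

```julia
using Dates

# Hypothetical depths and observation times.
depth = [5.0, 120.0, 30.0, 10.0, 45.0]
t = [DateTime(2000, 12, 15), DateTime(2001, 1, 3), DateTime(2001, 7, 9),
     DateTime(2002, 2, 20), DateTime(2002, 10, 1)]

# Winter months; Ref() makes broadcasting treat the tuple as a single set.
winter = (12, 1, 2)
sel = (depth .< 50) .& in.(Dates.month.(t), Ref(winter))

@show findall(sel)
```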

Let's create a histogram showing the number of observations per month:

In [None]:
PyPlot.plt[:hist](Dates.month.(obstime),12)
extrema(Dates.month.(obstime))
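
If the numbers behind the histogram are needed, the observations per month can also be counted directly. This is a small sketch on invented dates, independent of PyPlot:

```julia
using Dates

# Hypothetical observation times.
t = [DateTime(2000, 1, 5), DateTime(2001, 1, 20), DateTime(2001, 7, 9),
     DateTime(2002, 10, 1), DateTime(2003, 10, 12), DateTime(2004, 10, 30)]

months = Dates.month.(t)

# Number of observations in each calendar month (January to December).
counts = [count(==(m), months) for m in 1:12]

@show counts
```

Unlike a histogram with 12 bins spread between the smallest and largest month present, this always returns one count per calendar month, including months without data.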

## Plot the selected data positions

In [None]:
bathname = "data/gebco_30sec_16.nc"

if !isfile(bathname)
    download("https://b2drop.eudat.eu/s/o0vinoQutAC7eb0/download",bathname)
else
    @info "Bathymetry file already downloaded" 
end

bathisglobal = true

# Extract the bathymetry for plotting

# Extent of the selected observations ...
lonr = extrema(obslon[sel])
latr = extrema(obslat[sel])

# ... overridden here by a fixed plotting domain
lonr = -10:30
latr = 30:45
bx,by,b = extract_bath(bathname,bathisglobal,lonr,latr);

In [None]:
contourf(bx,by,permutedims(b, [2,1]), levels = [-1e5,0],colors = [[.5,.5,.5]])
scatter(obslon[sel],obslat[sel],10,obsval[sel])
# compute and set the correct aspect ratio
aspect_ratio = 1/cos(mean(latr) * pi/180)
gca()[:set_aspect](aspect_ratio)
colorbar(orientation = "horizontal")
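
The aspect ratio used in the plot can be checked on its own. The sketch below repeats the calculation for the latitude range of the cell above, using `cosd` (cosine of an angle in degrees) instead of converting to radians by hand.

```julia
using Statistics

# Latitude range of the map, in degrees (same values as in the cell above).
latr = 30:45

# One degree of longitude spans cos(latitude) times the distance of one
# degree of latitude, so the y axis is stretched by the inverse factor.
mean_lat = mean(latr)
aspect_ratio = 1 / cosd(mean_lat)

@show mean_lat
@show aspect_ratio
```

The ratio is 1 at the equator and grows towards the poles, so it matters more for high-latitude domains.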