# Using DwC-A_dotnet.Interactive

This notebook describes how to use DwC-A_dotnet and DwC-A_dotnet.Interactive to interactively query Darwin Core Archive files and plot your results.

Information on the .NET libraries used here may be found at the following links:

|Library|Link|
|---|---|
|DwC-A_dotnet|https://github.com/pjoiner/DwC-A_dotnet|
|DwC-A_dotnet.Interactive|https://github.com/pjoiner/DwC-A_dotnet.Interactive|


## Installation

Use the `#r` magic command to install the libraries from NuGet.

In [3]:
#r "nuget:DwC-A_dotnet,0.5.1"
#r "nuget:DwC-A_dotnet.Interactive,0.1.1-Pre"

Installed package DwC-A_dotnet.Interactive version 0.1.1-pre

Installed package DwC-A_dotnet version 0.5.1

Loaded DwC_A_dotnet.Interactive.DwCKernelExtension

## Open An Archive
Use the `ArchiveReader` class to open the archive, passing it the path to your archive. Unzipping the archive to a directory first is recommended, because it avoids the overhead of creating a temporary folder to extract it. If you open the zip file directly, remember to dispose of the temporary working directory at the end of your session by calling `archive.Dispose();`.

In [4]:
using DwC_A;

var archive = new ArchiveReader(@"./data/dwca-rooftop-v1.4");
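
When the archive is opened from a zip file instead, the cleanup described above can be made explicit with `try`/`finally`. This is a minimal sketch, not a required pattern. It assumes the packages from the Installation cell are loaded and that a zipped copy of the archive exists at the hypothetical path `./data/dwca-rooftop-v1.4.zip`. It uses only the `ArchiveReader`, `MetaData`, and `Dispose()` members that appear in this notebook.

```csharp
using DwC_A;

// Hypothetical path: a zipped copy of the archive used in this notebook
var zippedArchive = new ArchiveReader(@"./data/dwca-rooftop-v1.4.zip");
try
{
    // Work with the archive while its temporary working directory exists
    display(zippedArchive.MetaData);
}
finally
{
    // Dispose of the temporary working directory created for the zip file
    zippedArchive.Dispose();
}
```

The `finally` block runs even if the cell throws, so the temporary directory is not left behind after a failed query.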

## Archive MetaData
The interactive extensions library (`DwC-A_dotnet.Interactive`) registers kernel extensions that display archive metadata when an expression such as `archive.MetaData` is the last line of a cell. The same works for an `IFileReader` instance, which displays a list of its term metadata.

In [None]:
archive.MetaData

File Type,File Name,Row Type
CoreFile:,event.txt,http://rs.tdwg.org/dwc/terms/Event
Extension:,occurrence.txt,http://rs.tdwg.org/dwc/terms/Occurrence


In [None]:
archive.CoreFile

Index,Name,Term,Vocabulary
0,id,id,<null>
1,type,http://purl.org/dc/terms/type,<null>
2,license,http://purl.org/dc/terms/license,<null>
3,rightsHolder,http://purl.org/dc/terms/rightsHolder,<null>
4,ownerInstitutionCode,http://rs.tdwg.org/dwc/terms/ownerInstitutionCode,<null>
5,eventID,http://rs.tdwg.org/dwc/terms/eventID,<null>
6,samplingProtocol,http://rs.tdwg.org/dwc/terms/samplingProtocol,<null>
7,sampleSizeValue,http://rs.tdwg.org/dwc/terms/sampleSizeValue,<null>
8,sampleSizeUnit,http://rs.tdwg.org/dwc/terms/sampleSizeUnit,<null>
9,samplingEffort,http://rs.tdwg.org/dwc/terms/samplingEffort,<null>


In [None]:
archive.Extensions.GetFileReaderByFileName("occurrence.txt")

Index,Name,Term,Vocabulary
0,id,id,<null>
1,type,http://purl.org/dc/terms/type,<null>
2,license,http://purl.org/dc/terms/license,<null>
3,rightsHolder,http://purl.org/dc/terms/rightsHolder,<null>
4,institutionCode,http://rs.tdwg.org/dwc/terms/institutionCode,<null>
5,ownerInstitutionCode,http://rs.tdwg.org/dwc/terms/ownerInstitutionCode,<null>
6,basisOfRecord,http://rs.tdwg.org/dwc/terms/basisOfRecord,<null>
7,occurrenceID,http://rs.tdwg.org/dwc/terms/occurrenceID,<null>
8,recordedBy,http://rs.tdwg.org/dwc/terms/recordedBy,<null>
9,individualCount,http://rs.tdwg.org/dwc/terms/individualCount,<null>


## ToDynamic Extension Methods
Rows may be converted to `dynamic` objects using the `ToDynamic()` extension method from the `DwC_A_dotnet.Interactive.Extensions` namespace. Each field of a row is then accessible through the short name of its term. The following cell displays a subset of event data from the first 5 rows of the dataset.

In [None]:
using DwC_A_dotnet.Interactive.Extensions;

//This will return an IEnumerable<dynamic>
var dynRows = archive.CoreFile.ToDynamic();
foreach(var dynRow in dynRows.Take(5))
{
    var event1 = new 
    {
        dynRow.eventID,
        dynRow.eventDate,
        dynRow.sampleSizeValue
    };
    display(event1);
}

eventID,eventDate,sampleSizeValue
urn:zmuc:2006-07-14/2006-07-20,2006-07-14/2006-07-20,6


eventID,eventDate,sampleSizeValue
urn:zmuc:1993-05-24/1993-05-31,1993-05-24/1993-05-31,7


eventID,eventDate,sampleSizeValue
urn:zmuc:1997-07-21/1997-07-21,1997-07-21/1997-07-21,0


eventID,eventDate,sampleSizeValue
urn:zmuc:1998-05-27/1998-06-01,1998-05-27/1998-06-01,5


eventID,eventDate,sampleSizeValue
urn:zmuc:1998-06-19/1998-06-21,1998-06-19/1998-06-21,2


## Query Data Using LINQ

The following cell uses LINQ to compute the total individual count for each genus in a specific sampling event. Change the number in the `.Skip(1)` call to see totals calculated for other events.

In [None]:
var eventID = archive.CoreFile.ToDynamic()
    .Skip(1)  //<== Change this number to see other events
    .First()
    .eventID;

display(eventID);

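// Read the occurrence extension file as dynamic rows
var eventOccurrences = archive.Extensions.GetFileReaderByFileName("occurrence.txt").ToDynamic();

// Keep the occurrences for the selected event, then total individualCount per genus
var data = eventOccurrences
    .Where(n => n.eventID == eventID)
    .GroupBy(n => n.genus)
    .Select(g => new{
        genus = g.First().genus,
        count = g.Sum(c => int.Parse(c.individualCount))
    });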
data

urn:zmuc:1993-05-24/1993-05-31

index,genus,count
0,Abrostola,1
1,Acronicta,3
2,Agrotis,266
3,Apamea,1
4,Aproaerema,1
5,Argyresthia,1
6,Autographa,1
7,Biston,3
8,Caloptilia,3
9,Celypha,1
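
The filter, group, and sum shape of the query above can be tried without an archive. The following is a minimal, self-contained sketch over hypothetical in-memory rows; the tuple fields and sample values are made up for illustration and are not taken from the dataset. Counts are modeled as strings because the query above parses `individualCount` with `int.Parse`. The sketch swaps in `int.TryParse` so that a blank count contributes zero instead of throwing, and it sorts the totals in descending order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample rows standing in for occurrence.txt (illustrative values only)
var occurrences = new List<(string EventID, string Genus, string IndividualCount)>
{
    ("event-1", "Agrotis", "2"),
    ("event-1", "Agrotis", "3"),
    ("event-1", "Biston", "1"),
    ("event-1", "Apamea", ""),   // blank count
    ("event-2", "Agrotis", "7"),
};

var eventID = "event-1";

var totals = occurrences
    .Where(o => o.EventID == eventID)      // keep one sampling event
    .GroupBy(o => o.Genus)                 // one group per genus
    .Select(g => new
    {
        Genus = g.Key,
        // A failed parse falls back to 0, so blank counts add nothing
        Count = g.Sum(o => int.TryParse(o.IndividualCount, out var n) ? n : 0)
    })
    .OrderByDescending(t => t.Count)       // largest totals first
    .ToList();

foreach (var t in totals)
{
    Console.WriteLine($"{t.Genus}: {t.Count}");
}
```

Run as written, this prints `Agrotis: 5`, `Biston: 1`, and `Apamea: 0`. Whether a blank count should be treated as zero or excluded is a judgment call that depends on how the dataset records absences.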
