In [None]:
#!about

In [2]:
var data = new[]
{
    new { Type = "orange", Price = 1.2 },
    new { Type = "apple",  Price = 1.3 },
    new { Type = "grape",  Price = 1.4 }
};


In [29]:
. .\NotebookOutput.ps1

In [None]:
#!share --from csharp data
$tdr = $data | ConvertTo-TabularDataResource
$tdr.ExploreWithNteract()

# .NET Interactive ExtensionLab: Microsoft.Data.Analysis

This section demonstrates some of the experiments in the *ExtensionLab* relating to the `DataFrame` class from [`Microsoft.Data.Analysis`](https://www.nuget.org/packages/Microsoft.Data.Analysis/).

## The `#!linqify` magic command

The `#!linqify` magic command builds a strongly-typed wrapper class around a `Microsoft.Data.Analysis.DataFrame` instance, which lets you write LINQ code against your data.  (You can learn more about `DataFrame` [here](https://devblogs.microsoft.com/dotnet/an-introduction-to-dataframe/).)

To start, we'll add the ExtensionLab and `Microsoft.Data.Analysis` NuGet packages.

In [None]:
#r "nuget:Microsoft.DotNet.Interactive.ExtensionLab,*-*"
#r "nuget:Microsoft.Data.Analysis,0.4.0"

In [None]:
using Microsoft.Data.Analysis;

var MyDataFrame = DataFrame.LoadCsv(@"wins.csv");

MyDataFrame.Columns

After running the previous cell, you can see that the `DataFrame` has columns with a few different data types. But since these are only known once the data is loaded, accessing them in a strongly-typed way isn't normally possible.

The `DataFrameRow` indexer returns `object`. So although

```c#
MyDataFrame.Rows[0][1].GetType()
```

returns `System.Single`, the following won't compile, because the compiler only sees the indexer's declared return type, `System.Object`:

```c#
DataFrameRow row = MyDataFrame.Rows[0];
Single value = row[1];
```
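
An explicit cast is the usual manual workaround for an `object`-typed value. The following is a minimal sketch of that, not something taken from the ExtensionLab: `BoxedValueDemo` and its methods are hypothetical names, and the cell is assumed to hold a boxed `System.Single`, as `GetType()` reports above.

```c#
using System;

public static class BoxedValueDemo
{
    // Assumption: the cell holds a boxed System.Single.
    // The explicit cast unboxes it; without the cast the assignment does not compile.
    public static float ReadSingle(object cell) => (float)cell;

    // Unboxing must name the exact runtime type: asking a boxed float
    // for a double throws InvalidCastException at runtime.
    public static bool CanUnboxAsDouble(object cell)
    {
        try
        {
            _ = (double)cell;
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }
}
```

With the data frame loaded, `BoxedValueDemo.ReadSingle(MyDataFrame.Rows[0][1])` would compile, but every access needs a cast like this and a wrong guess only fails at runtime.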

This is where the `#!linqify` magic command we've installed from the ExtensionLab becomes useful. Since we know the column types in the `DataFrame` once it's been loaded, we can create a custom class with this understanding. And with .NET Interactive, we can do this at runtime, compile it, and replace the existing `MyDataFrame` variable with an instance of the new, more specific class.

Running `#!linqify --show-code True MyDataFrame` instead also displays the code that is generated.

In [38]:
#!linqify MyDataFrame

Now, you can write code to traverse the `DataFrame` using LINQ: 

In [None]:
MyDataFrame
   .OrderBy(row => row.HowMany)
   .ThenBy(row => row.WinningDriver)

## Visualizing the data with the nteract Data Explorer

The [nteract Data Explorer](https://blog.nteract.io/designing-the-nteract-data-explorer-f4476d53f897) is a powerful tool for understanding a dataset. The ExtensionLab package has included experimental support for visualizing data from a number of types, including `IDataView`, which `DataFrame` implements. The next cell takes a different route: it shares `MyDataFrame` with PowerShell, converts it with `ConvertTo-TabularDataResource`, and renders the result using the nteract Data Explorer:

In [None]:
#!share --from csharp MyDataFrame
$MyDataFrame | ConvertTo-TabularDataResource | ForEach-Object ExploreWithNteract

# .NET Interactive ExtensionLab: Microsoft.Data.Analysis Example 2 

Download some interesting data. **WARNING:** the dataset is quite large, and its rendered output can make the notebook slow to load, so clear the cell outputs afterwards.

In [None]:
using System.IO;
using System.Net.Http;

string housingPath = "housing.csv";

if (!File.Exists(housingPath))
{
    // Dispose the client once the download has finished.
    using var client = new HttpClient();
    var contents = await client
        .GetStringAsync("https://raw.githubusercontent.com/ageron/handson-ml2/master/datasets/housing/housing.csv");

    // The notebook's default working directory is the directory containing the notebook file,
    // so the relative path in housingPath is enough.
    File.WriteAllText(housingPath, contents);
}

In [None]:
using Microsoft.Data.Analysis;

var housingDataFrame = DataFrame.LoadCsv(@"housing.csv");

housingDataFrame.Columns

After running the previous cell, you can see that the `DataFrame` has columns with a few different data types. But since these are only known once the data is loaded, accessing them in a strongly-typed way isn't normally possible.

The commented line in the next cell won't compile because the `DataFrameRow` indexer returns `object`.

In [44]:
DataFrameRow row = housingDataFrame.Rows[0];

// This next line won't compile because the row indexer returns System.Object
//Single value = row[0];

But as you can see next, the runtime type is more specific. 

In [45]:
housingDataFrame.Rows[0][0].GetType()

This is where the `#!linqify` magic command we've installed from the ExtensionLab becomes useful. Since we know the column types in the `DataFrame` once it's been loaded, we can create a custom class with this understanding. And with .NET Interactive, we can do this at runtime, compile it, and replace the existing `housingDataFrame` variable with an instance of the new, more specific class.
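
To make the idea concrete, here is a hand-written sketch of the kind of typed wrapper involved. It is not the code that `#!linqify` generates (the `--show-code` option displays that). `UntypedRow`, `HousingRow`, `HousingDemo`, the two-column layout, and the sample values are all assumptions for illustration, and the stand-in row type replaces `DataFrameRow` so the snippet runs without any packages.

```c#
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for DataFrameRow: the indexer only promises object.
public class UntypedRow
{
    private readonly object[] _values;
    public UntypedRow(params object[] values) => _values = values;
    public object this[int index] => _values[index];
}

// Hypothetical typed view over one row. Column positions and types are
// assumptions; a generated wrapper would take them from the loaded DataFrame.
public class HousingRow
{
    private readonly UntypedRow _row;
    public HousingRow(UntypedRow row) => _row = row;
    public float median_house_value => (float)_row[0];
    public string ocean_proximity => (string)_row[1];
}

public static class HousingDemo
{
    // Illustrative sample values, ordered with the same kind of LINQ query
    // that the typed wrapper makes possible.
    public static List<float> OrderedValues()
    {
        var rows = new[]
        {
            new UntypedRow(300000f, "NEAR BAY"),
            new UntypedRow(150000f, "INLAND"),
            new UntypedRow(200000f, "NEAR BAY"),
        };

        return rows
            .Select(r => new HousingRow(r))
            .OrderBy(r => r.ocean_proximity)
            .ThenBy(r => r.median_house_value)
            .Select(r => r.median_house_value)
            .ToList();
    }
}
```

The casts live in one place, inside the property getters, so the LINQ query itself works with `float` and `string` members rather than `object`.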

In [None]:
#!linqify -h

In [None]:
#!linqify --show-code True housingDataFrame

Now, you can write code to traverse the `DataFrame` using LINQ: 

In [None]:
housingDataFrame
    .OrderBy(row => row.ocean_proximity)
    .ThenBy(row => row.median_house_value)

## Visualizing the data with the nteract Data Explorer

The [nteract Data Explorer](https://blog.nteract.io/designing-the-nteract-data-explorer-f4476d53f897) is a powerful tool for understanding a dataset. 

There was experimental support in the ExtensionLab package for visualizing data from a number of types, including `IDataView`, which `DataFrame` implements. This seems to have been removed, but we can use PowerShell to do it instead:

In [None]:
$tdr = Import-Csv .\housing.csv | ConvertTo-TabularDataResource
$tdr.ExploreWithSandDance()

In [None]:
$tdr.ExploreWithNteract()