In this example, we load a CSV file into a dataframe, filter it to houses priced under $250,000, and write the results to a new CSV file.

First, reference the Microsoft.Data.Analysis package, import the namespaces we need, and define the path to the data file.

In [2]:
#r "nuget: Microsoft.Data.Analysis, 0.21.1"

using System.IO;
using System.Linq;
using Microsoft.Data.Analysis;

// Define data path
var dataPath = Path.GetFullPath(@"data/home-sale-prices-1000.csv");


Now load the CSV into a dataframe.

In [6]:
var dataFrame = DataFrame.LoadCsv(dataPath);
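
To see what `LoadCsv` produces without the sample file, the sketch below writes a tiny CSV to a temporary path and loads it. The three rows are invented for illustration and are not taken from home-sale-prices-1000.csv; only the column names match. The names `sampleCsv`, `samplePath`, and `sample` are hypothetical.

```csharp
#r "nuget: Microsoft.Data.Analysis, 0.21.1"

using System;
using System.IO;
using Microsoft.Data.Analysis;

// Hypothetical sample rows; the values are invented for illustration.
var sampleCsv = "Id,Size,HistoricalPrice,CurrentPrice\n" +
                "1,1500,200000,240000\n" +
                "2,2200,310000,295000\n" +
                "3,1800,180000,199000\n";

// Write the sample to a temporary file so the example does not touch the data folder.
var samplePath = Path.Combine(Path.GetTempPath(), "home-sale-sample.csv");
File.WriteAllText(samplePath, sampleCsv);

// LoadCsv reads the header row and infers a type for each column.
var sample = DataFrame.LoadCsv(samplePath);

Console.WriteLine($"Rows: {sample.Rows.Count}, Columns: {sample.Columns.Count}");

// Print the inferred type of each column.
foreach (var column in sample.Columns)
{
    Console.WriteLine($"{column.Name}: {column.DataType.Name}");
}
```

Printing the inferred column types is a quick way to confirm that the price columns were read as numbers before filtering on them.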

Optionally, run the cell below to display summary statistics for the data that was loaded.

In [18]:
dataFrame.Description()

index,Description,Id,Size,HistoricalPrice,CurrentPrice
0,Length (excluding null values),1000.0,1000.0,1000.0,1000.0
1,Max,1000.0,4994.0,499953.0,599370.0
2,Min,1.0,1003.0,100126.0,151969.0
3,Mean,500.5,2981.518,301128.2,369356.97


Now filter the dataframe to the rows whose CurrentPrice is below 250,000.

In [24]:
PrimitiveDataFrameColumn<bool> boolFilter = dataFrame["CurrentPrice"].ElementwiseLessThan(250000);
DataFrame filteredDataFrame = dataFrame.Filter(boolFilter);
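
The filter works in two steps: `ElementwiseLessThan` builds one boolean per row, and `Filter` keeps the rows where that boolean is true. The sketch below shows this on a small in-memory dataframe. The four prices and the names `homes`, `mask`, and `affordable` are made up for illustration.

```csharp
#r "nuget: Microsoft.Data.Analysis, 0.21.1"

using System;
using Microsoft.Data.Analysis;

// Hypothetical data, invented for illustration.
var ids = new PrimitiveDataFrameColumn<int>("Id", new[] { 1, 2, 3, 4 });
var prices = new PrimitiveDataFrameColumn<float>(
    "CurrentPrice", new[] { 240000f, 295000f, 199000f, 250000f });
var homes = new DataFrame(ids, prices);

// One boolean per row: true where CurrentPrice is strictly less than 250000.
PrimitiveDataFrameColumn<bool> mask = homes["CurrentPrice"].ElementwiseLessThan(250000f);

// Filter returns a new dataframe containing only the rows where the mask is true.
DataFrame affordable = homes.Filter(mask);

// The comparison is strict, so the home priced at exactly 250000 is excluded.
Console.WriteLine($"Kept {affordable.Rows.Count} of {homes.Rows.Count} rows");
```

Because `Filter` returns a new dataframe, the original `homes` dataframe still holds all four rows afterwards.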


Optionally, display summary statistics for the filtered output.

In [16]:
filteredDataFrame.Description()

index,Description,Id,Size,HistoricalPrice,CurrentPrice
0,Length (excluding null values),222.0,222.0,222.0,222.0
1,Max,995.0,4925.0,496646.0,249853.0
2,Min,14.0,1004.0,104263.0,151969.0
3,Mean,550.5,3005.91,297502.47,199655.19


Now write the filtered dataframe out to a new CSV file.

In [19]:
DataFrame.SaveCsv(filteredDataFrame, "data/result.csv", ',');
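
One way to check a write like this is to load the file back and compare row counts. The sketch below does that with a small made-up dataframe and a temporary file instead of data/result.csv. The names `toSave`, `roundTripPath`, and `reloaded` are hypothetical.

```csharp
#r "nuget: Microsoft.Data.Analysis, 0.21.1"

using System;
using System.IO;
using Microsoft.Data.Analysis;

// Hypothetical data, invented for illustration.
var ids = new PrimitiveDataFrameColumn<int>("Id", new[] { 1, 3 });
var prices = new PrimitiveDataFrameColumn<float>("CurrentPrice", new[] { 240000f, 199000f });
var toSave = new DataFrame(ids, prices);

// Save to a temporary file with a comma separator, as in the cell above.
var roundTripPath = Path.Combine(Path.GetTempPath(), "result-roundtrip.csv");
DataFrame.SaveCsv(toSave, roundTripPath, ',');

// Load the file back and compare the row counts.
var reloaded = DataFrame.LoadCsv(roundTripPath);
Console.WriteLine($"Saved {toSave.Rows.Count} rows, reloaded {reloaded.Rows.Count} rows");

// Show the first line of the file, which holds the column names.
Console.WriteLine(File.ReadAllLines(roundTripPath)[0]);
```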

We can also display the filtered dataframe in a grid.

In [21]:
filteredDataFrame

[Paged grid output with columns: index, Id, Size, HistoricalPrice, CurrentPrice]


Let's try a simple transform on the filtered dataframe. We subtract the historical price from the current price to see whether each home's value has gone up or down.

In [25]:
filteredDataFrame["PriceDifference"] = filteredDataFrame["CurrentPrice"] - filteredDataFrame["HistoricalPrice"];
filteredDataFrame

[Paged grid output with columns: index, Id, Size, HistoricalPrice, CurrentPrice, PriceDifference]
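
To turn the new column into an answer, we can count how many homes gained value. The sketch below repeats the subtraction on a small in-memory dataframe and counts the rows with a positive difference. The prices and the names `homes`, `gained`, and `gainedCount` are made up for illustration.

```csharp
#r "nuget: Microsoft.Data.Analysis, 0.21.1"

using System;
using Microsoft.Data.Analysis;

// Hypothetical data, invented for illustration.
var historical = new PrimitiveDataFrameColumn<float>(
    "HistoricalPrice", new[] { 200000f, 210000f, 180000f });
var current = new PrimitiveDataFrameColumn<float>(
    "CurrentPrice", new[] { 240000f, 195000f, 199000f });
var homes = new DataFrame(historical, current);

// Element-wise subtraction; the result is stored as a new column.
homes["PriceDifference"] = homes["CurrentPrice"] - homes["HistoricalPrice"];

// Keep only the rows where the difference is positive (the home gained value).
PrimitiveDataFrameColumn<bool> gained = homes["PriceDifference"].ElementwiseGreaterThan(0f);
long gainedCount = homes.Filter(gained).Rows.Count;

Console.WriteLine($"{gainedCount} of {homes.Rows.Count} homes gained value");
```

A positive PriceDifference means the current price is above the historical price, and a negative value means the home lost value.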
