frame.ColumnTypes incorrectly changes to System.Object when filtering frame to rows where the column's values are null #516

Open
ppatino opened this issue Sep 22, 2020 · 1 comment

ppatino commented Sep 22, 2020

Issue description

I am running into an issue where a Frame "loses" a column's original data type when the Frame is filtered down to rows in which that column's values are all missing. Pardon the C#-isms in advance: I am using Deedle from C# code, but hopefully this is all clear.

Steps to reproduce the issue

  1. Start with a Frame where one of the columns can be null. In this example, we start with a frame where the columns are of type string ("Name" field) and int? ("Age" field).
  2. Inspect frame.ColumnTypes directly after the frame is created with Frame.FromRecords (see the code below). It returns the expected types, string and int?.
  3. Create a new frame by filtering to rows where the nullable column has no value, so that every remaining row is missing a value for that column. In this simple case with 2 Person records, I filter to index 0, i.e. the "Alice" record, where Age is null.
  4. Inspect filtered.ColumnTypes. It produces an unexpected result: the "Age" column now has type System.Object.
public class Person
{
	public int? Age;
	public string Name;
}
Person[] records = new Person[]
{
	new Person() { Name="Alice", Age = null},
	new Person() { Name="Bob", Age = 45}
};

Frame<int, string> frame = Frame.FromRecords(records);
//Output of `frame.ColumnTypes` is the expected `string`, `int?`

// Two equivalent ways of filtering rows that I am aware of (distinct variable names so both compile together):
Frame<int, string> filteredWhere = frame.Where(c => c.Key == 0);
Frame<int, string> filteredFromRows = Frame.FromRows(frame.Rows.Where(c => c.Key == 0));
//After filtering with either approach, the `ColumnTypes` property of the result returns types
//`string` and `Object` when that 2nd type should still be `int?`

What's the expected result?

  • ColumnTypes should be the same before and after filtering.

What's the actual result?

  • After filtering, the type of the column that only has null / missing values changes from its previous, correct type to System.Object (in my example, the type of the Age column changes from typeof(Int32?) to typeof(Object)).

ppatino commented Sep 22, 2020

I realized I can also do row filtering by using:

frame.RealignRows(frame.RowKeys.Where(rk => rk == 0));

In this case the rows are filtered as expected (here, simply to index 0) AND frame.ColumnTypes appears to be correct afterwards (i.e. the types are string, int?).
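For anyone who wants to compare the two behaviours side by side, here is a minimal sketch that prints the column types for the original frame, the Where-filtered frame, and the RealignRows result. It only uses the calls already shown in this issue. DescribeTypes is a hypothetical helper I made up for printing and is not part of Deedle. I have left the expected output out on purpose, since the exact type names printed may depend on the Deedle version.

```csharp
using System;
using System.Linq;
using Deedle;

public class Person
{
    public int? Age;
    public string Name;
}

public static class Program
{
    // Hypothetical helper (not a Deedle API): joins a frame's column types into one string.
    static string DescribeTypes(Frame<int, string> f) =>
        string.Join(", ", f.ColumnTypes.Select(t => t.ToString()));

    public static void Main()
    {
        var records = new[]
        {
            new Person { Name = "Alice", Age = null },
            new Person { Name = "Bob", Age = 45 }
        };

        Frame<int, string> frame = Frame.FromRecords(records);

        // Filter to row 0 ("Alice"), where Age is missing, in two different ways.
        Frame<int, string> filtered = frame.Where(c => c.Key == 0);
        Frame<int, string> realigned = frame.RealignRows(frame.RowKeys.Where(rk => rk == 0));

        // Per this issue, "filtered" reports Object for Age while "realigned" keeps int?.
        Console.WriteLine("original:  " + DescribeTypes(frame));
        Console.WriteLine("filtered:  " + DescribeTypes(filtered));
        Console.WriteLine("realigned: " + DescribeTypes(realigned));
    }
}
```

If the column type matters downstream, RealignRows looks like a usable workaround until the Where / FromRows behaviour is addressed.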
