Tabular data container (data frames) #15
Comments
|
This is being worked on by Prateek Nayak during GSoC 2019. |
|
CC @Kriyszig |
|
Yes, I will be working on this project. So far I have contacted the mentors and am exploring ndslice in mir-algorithm, while also looking into displaying the dataframe on the terminal with properly aligned columns. I'm a bit tight on time until this weekend because of final examinations, but after that I'll be working at full capacity to realize the project. I'm mainly looking into Pandas and its implementation of dataframes, because I have worked quite extensively with Python in the past. |
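As a rough illustration of the aligned terminal output mentioned above, here is a minimal sketch that uses only Phobos. The function `renderTable` and its parameters are hypothetical names for this example and are not part of mir-algorithm. It assumes ASCII labels, since `.length` counts code units and not display columns.

```d
import std.algorithm.comparison : max;
import std.array : join;
import std.conv : to;
import std.string : leftJustify, rightJustify;

/// Renders row labels, column labels and values as text with padded columns.
/// Hypothetical helper for illustration; assumes ASCII text.
string renderTable(string[] rowLabels, string[] colLabels, double[][] values)
{
    // Convert every cell to a string first so its width can be measured.
    string[][] cells;
    foreach (row; values)
    {
        string[] line;
        foreach (v; row)
            line ~= v.to!string;
        cells ~= line;
    }

    // Width of the leading row-label column.
    size_t labelWidth = 0;
    foreach (l; rowLabels)
        labelWidth = max(labelWidth, l.length);

    // Width of each data column: the widest of its header and its cells.
    auto widths = new size_t[colLabels.length];
    foreach (j, c; colLabels)
    {
        widths[j] = c.length;
        foreach (line; cells)
            widths[j] = max(widths[j], line[j].length);
    }

    // Header line: blank label cell, then right-aligned column labels.
    string[] lines;
    string[] header = [leftJustify("", labelWidth)];
    foreach (j, c; colLabels)
        header ~= rightJustify(c, widths[j]);
    lines ~= header.join("  ");

    // Data lines: left-aligned row label, right-aligned values.
    foreach (i, line; cells)
    {
        string[] parts = [leftJustify(rowLabels[i], labelWidth)];
        foreach (j, cell; line)
            parts ~= rightJustify(cell, widths[j]);
        lines ~= parts.join("  ");
    }
    return lines.join("\n");
}

unittest
{
    import std.string : splitLines;

    auto text = renderTable(["a", "bb"], ["x", "yy"],
                            [[1.0, 2.5], [10.0, 3.0]]);
    auto lines = text.splitLines;
    // Every rendered line has the same width, so the columns line up.
    assert(lines.length == 3);
    assert(lines[0].length == lines[1].length);
    assert(lines[1].length == lines[2].length);
}
```

Measuring all widths before printing costs a second pass over the data, but it keeps the formatting logic independent of how the values are stored.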
|
Interop with pandas via JSON and msgpack might be quite helpful. I have written a streaming msgpack decoder (using msgpack-d) to work with our own simple data frame implementation, and there is some old code for reading and writing HDF5 too. |
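To make the JSON route concrete, here is a minimal sketch that reads the layout pandas writes with `to_json(orient="split")`, i.e. an object with `columns`, `index` and `data` keys. The names `SplitTable`, `toDouble` and `parseSplit` are hypothetical and only serve this example. It assumes string index labels and purely numeric cells with no `null` entries, which real exports will not always satisfy.

```d
import std.json : JSONType, JSONValue, parseJSON;

/// Hypothetical plain container for this sketch; not a mir type.
struct SplitTable
{
    string[] index;
    string[] columns;
    double[][] data;
}

/// Reads a JSON number as double, whichever numeric type the parser chose.
double toDouble(JSONValue v)
{
    switch (v.type)
    {
        case JSONType.integer:  return cast(double) v.integer;
        case JSONType.uinteger: return cast(double) v.uinteger;
        case JSONType.float_:   return v.floating;
        default: throw new Exception("expected a JSON number");
    }
}

/// Parses a "split"-oriented JSON document into a SplitTable.
/// Assumes string index labels and numeric cells (no nulls).
SplitTable parseSplit(string text)
{
    auto json = parseJSON(text);
    SplitTable t;
    foreach (c; json["columns"].array)
        t.columns ~= c.str;
    foreach (i; json["index"].array)
        t.index ~= i.str;
    foreach (row; json["data"].array)
    {
        double[] line;
        foreach (cell; row.array)
            line ~= toDouble(cell);
        t.data ~= line;
    }
    return t;
}

unittest
{
    auto t = parseSplit(
        `{"columns":["income","outcome"],"index":["r0","r1"],"data":[[1,2.5],[3,4]]}`);
    assert(t.columns == ["income", "outcome"]);
    assert(t.index == ["r0", "r1"]);
    assert(t.data[0][1] == 2.5);
}
```

The explicit switch on the JSON type is there because `std.json` stores whole numbers such as `1` as integers, so reading them through `.floating` directly would throw.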
|
Initial support for dataframe has been added to mir-algorithm.

@safe pure unittest
{
    import mir.ndslice.slice;
    import mir.ndslice.allocation: slice;
    import std.datetime.date;

    auto dataframe = slice!(double, Date, string)(4, 3);
    assert(dataframe.length == 4);
    assert(dataframe.length!1 == 3);
    assert(dataframe.elementCount == 4 * 3);

    static assert(is(typeof(dataframe) ==
        Slice!(double*, 2, Contiguous, Date*, string*)));

    // Dataframe labels are contiguous 1-dimensional slices.

    // Fill row labels
    dataframe.label[] = [
        Date(2019, 1, 24),
        Date(2019, 2, 2),
        Date(2019, 2, 4),
        Date(2019, 2, 5),
    ];

    assert(dataframe.label!0[2] == Date(2019, 2, 4));

    // Fill column labels
    dataframe.label!1[] = ["income", "outcome", "balance"];

    assert(dataframe.label!1[2] == "balance");

    // Change label element
    dataframe.label!1[2] = "total";
    assert(dataframe.label!1[2] == "total");

    // Attach a newly allocated label
    dataframe.label!1 = ["Income", "Outcome", "Balance"].sliced;

    assert(dataframe.label!1[2] == "Balance");
} |
Pandas, R and Julia have made data frames very popular. As D is getting more interest from data scientists (e.g. eBay or AdRoll), it would be very beneficial to use one language for the entire data analysis pipeline, especially considering that D (in contrast to popular languages like Python, R or Julia) is compiled to native machine code and gets optimized by the sophisticated LLVM backend.
Minimum requirements: