DOC: example of DataFrame export to HDF5 and import into R #9636
Comments
This looks very helpful! Would you like to submit a PR that adds a link to this issue in the documentation?
Yes, I'll do that, but it might take some time.
closed by #9661
This is helpful, but what if the format is set to 'table'? The provided function doesn't seem to work in that case.
Python:
store.put("dataframe", df, format='table', data_columns=df.columns)
R:
> loadhdf5data(h5File)
data frame with 0 columns and 0 rows
For table format you may use rhdf5 directly (non-working excerpts):
Python:
with pd.HDFStore(out_name, mode="w", complib=str("zlib"),
                 complevel=5) as hdf_store:
    # Write some data
    hdf_store.append("features", job_data.loc[:, feat_columns],
                     format="table", index=False)
    hdf_store.append("labels", job_data.loc[:, label_columns],
                     format="table", data_columns=label_columns, index=False)
R:
library(rhdf5)
loadFeatures <- function(h5File) {
  # Load feature values from separate HDF5 tables into data.frame object
  #
  # Args:
  #   h5File: filename of HDF5 file to be loaded. It has to contain two tables:
  #           "/features/table" with feature values and "/labels/table" with
  #           corresponding block labels.
  #
  # Returns:
  #   A data.frame with feature values and block labels
  labels <- h5read(h5File, "/labels/table", read.attributes = FALSE)
  featTable <- h5read(h5File, "/features/table", compoundAsDataFrame = FALSE)
  feats <- data.frame(t(featTable$values_block_0))
  # data format conversion is application specific
  feats$job <- factor(labels$job)
  feats$layer <- factor(labels$layer)
  feats$block <- labels$block
  feats$isElevated <- as.logical(labels$is_elevated)
  feats$partLabel <- labels$part_label
  return(feats)
}
feats <- loadFeatures(few_h5File)
It's been a while and I haven't used pandas and R in combination since, but this should get you started.
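For anyone wondering where the paths in the R function come from: as far as I understand the pandas table format, each key becomes a group holding one compound dataset named "table", with same-dtype columns packed into fields such as values_block_0 and every data column stored as a field of its own. Here is a minimal sketch for checking this from the Python side. The helper name table_field_names and the demo column names are made up for illustration, and it assumes pandas with PyTables installed.

```python
import os
import tempfile

import pandas as pd  # writing HDF5 also requires the PyTables package ("tables")


def table_field_names(df, key, data_columns=None):
    """Write df in table format and return the field names of /<key>/table."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "layout_demo.h5")
        with pd.HDFStore(path, mode="w") as store:
            store.append(key, df, format="table", data_columns=data_columns)
            # get_storer() gives access to the underlying PyTables table node
            return list(store.get_storer(key).table.colnames)


# Hypothetical demo data: two float features and one integer label column
demo_feats = pd.DataFrame({"f0": [0.1, 0.2, 0.3], "f1": [1.0, 2.0, 3.0]})
demo_labels = pd.DataFrame({"block": [1, 2, 3]})

print(table_field_names(demo_feats, "features"))
print(table_field_names(demo_labels, "labels", data_columns=["block"]))
```

The field names printed here are the ones to pass to h5read on the R side, which is probably also why a loader written for the fixed format finds no columns in a table-format file.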
@joschkazj Thanks for sharing your function. I am having issues with integers larger than 32 bits. For example, if I create the data frame:
Then I get this warning message:
However, adding
Do you have any idea how I can fix this issue?
To fix this bit64 issue, see: How can I load a data frame saved in pandas as an HDF5 file in R without losing integers larger than 32 bit? (In short: install.packages("bit64") + library(bit64) + add bit64conversion='bit64' twice in loadhdf5data.)
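Some background on why this matters: R's native integer type is 32-bit (with the smallest 32-bit value reserved for NA), and a double can only represent integers exactly up to 2^53 in magnitude, so anything larger needs a 64-bit integer type such as the one from bit64. The sketch below is a rough pre-export check on the Python side. The function name and the returned labels are my own, not part of pandas or rhdf5.

```python
INT32_MAX = 2**31 - 1          # R reserves -2**31 as NA_integer_
FLOAT64_EXACT_MAX = 2**53      # doubles represent every integer up to this magnitude
INT64_MAX = 2**63 - 1


def r_storage_for(value):
    """Classify a Python int by the smallest R type that can hold it exactly."""
    if -INT32_MAX <= value <= INT32_MAX:
        return "integer"
    if abs(value) <= FLOAT64_EXACT_MAX:
        return "double"
    if -INT64_MAX - 1 <= value <= INT64_MAX:
        return "integer64"  # needs the bit64 package on the R side
    return "unrepresentable"


# Hypothetical column values to check before exporting
for v in (42, 2**31, 2**53 + 1):
    print(v, r_storage_for(v))
```

If any value in a column lands in the "integer64" bucket, reading it without a 64-bit conversion on the R side will lose precision.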
When searching the web, I didn't find any examples of a working pandas-to-R data transfer using HDF5 files, even though the pandas documentation mentions that the HDF5 format it uses "can easily be imported into R using the rhdf5 library". The pandas export works as expected, and I inspected the file format using the HDF Group's viewer (HDFView).
After some experimentation, I have a working sample for DataFrame export from Python/pandas and import into R, which could be added to the documentation to help future users:
Output:
Now you can import the DataFrame:
I hope this helps someone. :-)
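For readers without HDFView, the layout of a default (fixed-format) export can also be listed from Python. This is only a sketch and not the sample from this issue: the helper name fixed_format_nodes and the demo frame are invented for illustration, and it assumes pandas with PyTables installed.

```python
import os
import tempfile

import pandas as pd  # writing HDF5 also requires the PyTables package ("tables")


def fixed_format_nodes(df, key):
    """Write df in fixed format and return the names of the nodes under /<key>."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "fixed_demo.h5")
        df.to_hdf(path, key=key, mode="w", format="fixed")
        with pd.HDFStore(path, mode="r") as store:
            # get_node() returns the PyTables group that holds the frame
            group = store.get_node(key)
            return sorted(group._v_children)


# Hypothetical demo frame with two float columns
demo = pd.DataFrame({"first": [0.1, 0.2, 0.3], "second": [1.0, 2.0, 3.0]})
print(fixed_format_nodes(demo, "df_for_r"))
```

The node names it prints (axes and value blocks) are the pieces an R loader has to reassemble into a data.frame.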