This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

How to read rec files in R #6943

Closed
lichen11 opened this issue Jul 6, 2017 · 8 comments

lichen11 commented Jul 6, 2017

I used im2rec.py from MXNet to generate .rec files for training and validation. How can I access the contents of a .rec file in R, such as the indices, labels, or images?
Say val is the variable holding the .rec file I generated and loaded in R.
I ran
val@.xData
to see whether I could get the indices or labels, but R crashed.

lichen11 commented Jul 6, 2017

I figured out that I can read the .lst file to get the labels.
However, I am still curious how to access the data content of the .rec file.
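
Reading the labels from the .lst file can be sketched in base R as follows. This assumes the usual im2rec.py layout of one tab-separated line per image (index, label, relative path) with a single label column. The helper name read_lst and the sample file contents are hypothetical.

```r
# Hypothetical helper: parse an im2rec-style .lst file.
# Assumes tab-separated columns: index, label, path (one label per image).
read_lst <- function(path) {
  read.delim(
    path,
    header = FALSE,
    col.names = c("index", "label", "path"),
    stringsAsFactors = FALSE
  )
}

# Example with a small made-up list file written to a temporary location.
lst_file <- tempfile(fileext = ".lst")
writeLines(c("0\t3\tcat/a.jpg", "1\t7\tdog/b.jpg"), lst_file)

lst <- read_lst(lst_file)
lst$label  # labels in file order
```

If the list file carries several label columns per image, col.names would need to be adjusted to match.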

lichen11 closed this as completed Jul 6, 2017

thirdwing (Contributor) commented

Please try the code below. I used the cifar10 data from http://data.mxnet.io/mxnet/data/cifar10.zip

dataiter <- mx.io.ImageRecordIter(
  path.imgrec     = "./data/cifar/train.rec",
  path.imglist    = "./data/cifar/train.lst",
  mean.img        = "./data/cifar/cifar10_mean.bin",
  batch.size      = 100,
  data.shape      = c(28, 28, 3),
  rand.crop       = TRUE,
  rand.mirror     = TRUE
)

dataiter$reset()

dataiter$iter.next()

dataiter$value()$label

dataiter$value()$data

thirdwing (Contributor) commented

You can get more info on how to use iterators from https://github.com/dmlc/mxnet/blob/master/R-package/tests/testthat/test_io.R

thirdwing added the R label Jul 6, 2017

lichen11 commented Jul 6, 2017

Thank you!
When I tried to access val$value()$label, R just crashed. The dataset has 2137 images, each 224 by 224. I don't think that is large enough to cause a crash.

thirdwing (Contributor) commented

Please make sure the following two lines have been executed before calling value().

dataiter$reset()
dataiter$iter.next()

lichen11 commented Jul 6, 2017

Thank you! Now I can see the label and the data. However, I only see 8 labels instead of all 2137. When creating the iterators, I set the batch size to 8:
data <- get_iterator(data_shape = c(224, 224, 3),
                     train_data = "train.rec",
                     val_data   = "val.rec",
                     batch_size = 8)
train <- data$train
val <- data$val
Maybe this is why only 8 samples are showing. How do I access all the labels?

thirdwing (Contributor) commented

That is what batch_size means: each call to value() returns one batch. To get all the labels, either loop over the batches or set the batch size to the number of labels.
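
The loop option can be sketched as below. This is a minimal sketch, not code from the MXNet package: the helper name collect_labels is hypothetical, and it only assumes the iterator exposes reset(), iter.next() and value() as used earlier in this thread.

```r
# Hypothetical helper: gather the labels of every batch from an iterator
# that exposes reset(), iter.next() and value().
collect_labels <- function(iter) {
  labels <- c()
  iter$reset()
  while (iter$iter.next()) {
    # as.array() converts an MXNet NDArray into an R array;
    # on plain R vectors it is harmless.
    labels <- c(labels, as.array(iter$value()$label))
  }
  labels
}

# With the iterator from this thread (requires the mxnet package):
# all_labels <- collect_labels(val)
```

If the number of samples is not a multiple of the batch size, the final batch may be padded, so the result can be longer than the dataset. Treat that as something to check for your iterator settings rather than a guarantee.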

lichen11 commented Sep 18, 2017

Hi, I have a follow-up question:
dataiter <- mx.io.ImageRecordIter(
  path.imgrec     = "./data/cifar/train.rec",
  path.imglist    = "./data/cifar/train.lst",
  mean.img        = "./data/cifar/cifar10_mean.bin",
  batch.size      = 100,
  data.shape      = c(28, 28, 3),
  rand.crop       = TRUE,
  rand.mirror     = TRUE
)
dataiter$reset()
dataiter$iter.next()

I assign the variables

labels <- dataiter$value()$label
mydata <- dataiter$value()$data

dim(mydata) returns 28 28 3 100,

but if I run

mydata[,,,1] or labels[1:10]

an error occurs:
"Object of type 'externalptr' is not subsettable."

How do I access the values in mydata and labels?

Thanks!
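
A likely explanation, offered as a sketch rather than a confirmed answer from this thread: value() returns MXNet NDArray objects, which R holds as external pointers, so they cannot be subset with [ directly. Converting them with as.array() first gives ordinary R arrays. The helper name first_image is hypothetical.

```r
# Hypothetical helper: convert a batch to an R array, then take the first image.
# Assumes the batch is laid out as width x height x channels x batch.
first_image <- function(batch_data) {
  arr <- as.array(batch_data)  # NDArray -> R array (no-op for R arrays)
  arr[, , , 1]
}

# With the iterator from the question (requires the mxnet package):
# img    <- first_image(dataiter$value()$data)
# labels <- as.array(dataiter$value()$label)
# labels[1:10]
```

The same conversion applies to the labels: once they are an R array, ordinary indexing such as labels[1:10] should work.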
