This repository has been archived by the owner. It is now read-only.

LocalTransformProcessRecordReader can't handle filter ops #552

Closed
AlexDBlack opened this Issue Apr 9, 2018 · 0 comments

AlexDBlack commented Apr 9, 2018

https://gist.github.com/AlexDBlack/27395cd9ba1a5d0bb5a487c12dd36303

Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
	at io.netty.buffer.ArrowBuf.checkIndexD(ArrowBuf.java:139)
	at io.netty.buffer.ArrowBuf.chk(ArrowBuf.java:162)
	at io.netty.buffer.ArrowBuf.getByte(ArrowBuf.java:964)
	at org.apache.arrow.vector.BaseFixedWidthVector.isSet(BaseFixedWidthVector.java:798)
	at org.apache.arrow.vector.BaseFixedWidthVector.isNull(BaseFixedWidthVector.java:787)
	at org.datavec.arrow.recordreader.ArrowWritableRecordBatch.get(ArrowWritableRecordBatch.java:132)
	at org.datavec.arrow.recordreader.ArrowWritableRecordBatch.get(ArrowWritableRecordBatch.java:22)
	at org.datavec.local.transforms.LocalTransformProcessRecordReader.next(LocalTransformProcessRecordReader.java:37)
	at org.datavec.api.records.mapper.RecordMapper.copy(RecordMapper.java:88)
	at org.datavec.transform.basic.BasicDataVecExampleLocal.main(BasicDataVecExampleLocal.java:165)

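Reason for the exception: the transform produces a length 0 output, which I'm pretty sure is due to filtering. The trace shows LocalTransformProcessRecordReader.next calling ArrowWritableRecordBatch.get for index 0 on a batch whose valid range is (0, 0), i.e. an empty batch.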

Edit: unit test:

    @Test
    public void testLocalFilter(){

        List<List<Writable>> in = new ArrayList<>();
        in.add(Arrays.asList(new Text("Keep"), new IntWritable(0)));
        in.add(Arrays.asList(new Text("Remove"), new IntWritable(1)));
        in.add(Arrays.asList(new Text("Keep"), new IntWritable(2)));
        in.add(Arrays.asList(new Text("Remove"), new IntWritable(3)));

        Schema s = new Schema.Builder()
                .addColumnCategorical("cat", "Keep", "Remove")
                .addColumnInteger("int")
                .build();

        // filter(...) removes every record that matches the condition, so the "Remove" rows are dropped
        TransformProcess tp = new TransformProcess.Builder(s)
                .filter(new CategoricalColumnCondition("cat", ConditionOp.Equal, "Remove"))
                .build();

        RecordReader rr = new CollectionRecordReader(in);
        LocalTransformProcessRecordReader ltprr = new LocalTransformProcessRecordReader(rr, tp);

        List<List<Writable>> out = new ArrayList<>();
        while(ltprr.hasNext()){
            out.add(ltprr.next());
        }

        List<List<Writable>> exp = Arrays.asList(in.get(0), in.get(2));

        assertEquals(exp, out);
    }
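
For illustration only, here is a minimal sketch of one way a reader could avoid handing back a filtered-out record: a look-ahead wrapper whose hasNext() only returns true once a surviving record has been buffered. It uses plain java.util types and no DataVec classes. The class name SkippingReader and the collect helper are hypothetical, and this issue does not say whether the referenced commit takes this approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/**
 * Hypothetical look-ahead wrapper (not DataVec code). Records matching the
 * "remove" predicate are skipped, so next() never sees an empty result.
 */
final class SkippingReader<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<T> remove; // true = drop the record, mirroring filter semantics
    private T buffered;
    private boolean hasBuffered;

    SkippingReader(Iterator<T> source, Predicate<T> remove) {
        this.source = source;
        this.remove = remove;
    }

    @Override
    public boolean hasNext() {
        // Pull from the source until a record survives the filter or the source is exhausted.
        while (!hasBuffered && source.hasNext()) {
            T candidate = source.next();
            if (!remove.test(candidate)) {
                buffered = candidate;
                hasBuffered = true;
            }
        }
        return hasBuffered;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T out = buffered;
        buffered = null;
        hasBuffered = false;
        return out;
    }

    /** Runs the wrapper over a list and collects the surviving records. */
    static <T> List<T> collect(List<T> in, Predicate<T> remove) {
        List<T> out = new ArrayList<>();
        Iterator<T> it = new SkippingReader<>(in.iterator(), remove);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> in = Arrays.asList("Keep", "Remove", "Keep", "Remove");
        System.out.println(collect(in, s -> s.equals("Remove"))); // prints [Keep, Keep]
    }
}
```

The input mirrors the "cat" column of the unit test above: the two "Remove" entries are dropped and the loop terminates cleanly even when the last record is filtered out.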

@AlexDBlack AlexDBlack added the bug label Apr 9, 2018

@raver119 raver119 added the ETL label Apr 29, 2018

@AlexDBlack AlexDBlack self-assigned this May 8, 2018

AlexDBlack added a commit that referenced this issue May 8, 2018
