[Java] Out of order writes using setSafe #17197
Description
I noticed that calling setSafe on a VarCharVector with indices that are not in increasing order causes lastIndex to be set to the index passed in the most recent call to setSafe.
Is this documented and expected behavior?
Sample code:
import java.util.Collections;
import lombok.extern.slf4j.Slf4j;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;
import org.apache.arrow.vector.util.Text;

@Slf4j
public class ATest {
    public static void main(String[] args) {
        Schema schema = new Schema(Collections.singletonList(Field.nullable("Data", new ArrowType.Utf8())));
        try (VectorSchemaRoot vroot = VectorSchemaRoot.create(schema, new RootAllocator())) {
            VarCharVector vec = (VarCharVector) vroot.getVector("Data");
            // Fill indices 0..9 in increasing order.
            for (int i = 0; i < 10; i++) {
                vec.setSafe(i, new Text(Integer.toString(i) + "_mtest"));
            }
            // Out-of-order write: overwrite index 7 after index 9 has been set.
            vec.setSafe(7, new Text(Integer.toString(7) + "_new"));
            log.info("Data at index 8 Before {}", vec.getObject(8));
            vroot.setRowCount(10);
            log.info("Data at index 8 After {}", vec.getObject(8));
            log.info(vroot.contentToTSVString());
        }
    }
}
If I don't set index 7 after the loop, I get all of the entries 0_mtest, 1_mtest, ..., 9_mtest.
If I set index 7 after the loop, I see 0_mtest, ..., 5_mtest, 6_mtest, 7_new, followed by empty values for indices 8 and 9.
Before the setRowCount call, the data at index 8 is "st8_mtest" and the data at index 9 is "9_mtest".
After the setRowCount call, the data at index 8 is "" and the data at index 9 is "".
If the replacement text is longer than the 4-character "_new" suffix, it eats further into the data at the following indices.
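The observations above are consistent with how a variable-width vector lays out its values: one contiguous data buffer plus an offset buffer, where value i occupies the bytes from offsets[i] to offsets[i + 1]. The sketch below is a simplified plain-Java model of that layout, not Arrow's actual implementation. The class name OffsetModel and its methods are hypothetical, and it assumes that the vector tracks the last written index (lastIndex in this report) and fills every slot after it with empty values when the value count is set.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified model of a variable-width vector (not Arrow code).
public class OffsetModel {
    private byte[] data = new byte[64];
    private final int[] offsets;   // value i spans data[offsets[i] .. offsets[i + 1])
    private int lastSet = -1;      // index passed to the most recent set() call

    public OffsetModel(int capacity) {
        offsets = new int[capacity + 1];
    }

    public void set(int index, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        // Slots skipped since the last write become empty values.
        for (int i = lastSet + 1; i < index; i++) {
            offsets[i + 1] = offsets[i];
        }
        int start = offsets[index];
        if (start + bytes.length > data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, start + bytes.length));
        }
        // The bytes are written in place; later offsets are NOT shifted.
        System.arraycopy(bytes, 0, data, start, bytes.length);
        offsets[index + 1] = start + bytes.length;
        lastSet = index;
    }

    public String get(int index) {
        int start = offsets[index];
        return new String(data, start, offsets[index + 1] - start, StandardCharsets.UTF_8);
    }

    public void setValueCount(int count) {
        // Everything after the last written index is treated as never written.
        for (int i = lastSet + 1; i < count; i++) {
            offsets[i + 1] = offsets[i];
        }
        lastSet = count - 1;
    }

    public static void main(String[] args) {
        OffsetModel vec = new OffsetModel(10);
        for (int i = 0; i < 10; i++) {
            vec.set(i, i + "_mtest");
        }
        vec.set(7, "7_new");                 // out-of-order write
        System.out.println(vec.get(8));      // prints "st8_mtest"
        vec.setValueCount(10);
        System.out.println("[" + vec.get(8) + "]"); // prints "[]"
    }
}
```

In this model, the write to index 7 moves offsets[8] back by two bytes, so index 8 picks up the leftover "st" from the old "7_mtest". Setting the value count then empties indices 8 and 9 because they come after the last written index. Under the same assumption, writing values in increasing index order avoids the problem.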
Reporter: Saurabh
Assignee: Liya Fan / @liyafan82
PRs and other links:
Note: This issue was originally created as ARROW-8909. Please see the migration documentation for further details.