Several sstables used as golden files in write tests for MC format fail to work with sstabledump #4043

Closed
haaawk opened this issue Jan 2, 2019 · 15 comments

haaawk commented Jan 2, 2019

golden_sstables_dumps.txt was generated using the following bash command:

for i in ../scylla/tests/sstables/3.x/uncompressed/write_*; do
    echo "$i/mc-1-big-Data.db" >> ~/golden_sstables_dumps.txt 2>&1 &&
        ./tools/bin/sstabledump "$i/mc-1-big-Data.db" >> ~/golden_sstables_dumps.txt 2>&1
done

The following sstables fail:

  1. scylla/tests/sstables/3.x/uncompressed/write_compact_table/mc-1-big-Data.db
Exception in thread "main" java.lang.NullPointerException
        at org.apache.cassandra.serializers.Int32Serializer.deserialize(Int32Serializer.java:31)
        at org.apache.cassandra.serializers.Int32Serializer.deserialize(Int32Serializer.java:25)
        at org.apache.cassandra.db.marshal.Int32Type.toJSONString(Int32Type.java:100)
        at org.apache.cassandra.tools.JsonTransformer.serializeClustering(JsonTransformer.java:350)
        at org.apache.cassandra.tools.JsonTransformer.serializeRow(JsonTransformer.java:247)
        at org.apache.cassandra.tools.JsonTransformer.serializePartition(JsonTransformer.java:215)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
        at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
        at java.util.Iterator.forEachRemaining(Iterator.java:116)
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
        at org.apache.cassandra.tools.JsonTransformer.toJson(JsonTransformer.java:104)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:242)
  2. scylla/tests/sstables/3.x/uncompressed/write_counter_table/mc-1-big-Data.db
Exception in thread "main" java.lang.NullPointerException
        at org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:412)
        at org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
  3. scylla/tests/sstables/3.x/uncompressed/write_different_types/mc-1-big-Data.db
Exception in thread "main" org.apache.cassandra.exceptions.ConfigurationException: Unable to find abstract-type class 'org.apache.cassandra.db.marshal.DurationType'
        at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:464)
        at org.apache.cassandra.db.marshal.TypeParser.getAbstractType(TypeParser.java:334)
        at org.apache.cassandra.db.marshal.TypeParser.parse(TypeParser.java:84)
        at org.apache.cassandra.db.SerializationHeader$Serializer.readType(SerializationHeader.java:559)
        at org.apache.cassandra.db.SerializationHeader$Serializer.readColumnsWithType(SerializationHeader.java:546)
        at org.apache.cassandra.db.SerializationHeader$Serializer.deserialize(SerializationHeader.java:499)
        at org.apache.cassandra.db.SerializationHeader$Serializer.deserialize(SerializationHeader.java:409)
        at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:123)
        at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:94)
        at org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:102)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.db.marshal.DurationType
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:460)
        ... 10 more
  4. scylla/tests/sstables/3.x/uncompressed/write_many_deleted_partitions/mc-1-big-Data.db
Exception in thread "main" java.lang.NullPointerException
        at org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:412)
        at org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
  5. scylla/tests/sstables/3.x/uncompressed/write_many_live_partitions/mc-1-big-Data.db
Exception in thread "main" java.lang.NullPointerException
        at org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:412)
        at org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
  6. scylla/tests/sstables/3.x/uncompressed/write_many_range_tombstones/mc-1-big-Data.db
Exception in thread "main" java.lang.NullPointerException
        at org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:412)
        at org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
  7. scylla/tests/sstables/3.x/uncompressed/write_overlapped_range_tombstones/mc-1-big-Data.db
Exception in thread "main" java.lang.NullPointerException
        at org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:412)
        at org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
  8. scylla/tests/sstables/3.x/uncompressed/write_overlapped_start_range_tombstones/mc-1-big-Data.db
Exception in thread "main" java.lang.NullPointerException
        at org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:412)
        at org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
  9. scylla/tests/sstables/3.x/uncompressed/write_shadowable_deletion/mc-1-big-Data.db
Exception in thread "main" org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /home/haaawk/haaawk/scylla/tests/sstables/3.x/uncompressed/write_shadowable_deletion/mc-1-big-Data.db
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:112)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:30)
        at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
        at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95)
        at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
        at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
        at org.apache.cassandra.tools.JsonTransformer.serializePartition(JsonTransformer.java:210)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
        at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
        at java.util.Iterator.forEachRemaining(Iterator.java:116)
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
        at org.apache.cassandra.tools.JsonTransformer.toJson(JsonTransformer.java:104)
        at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:242)
Caused by: java.io.EOFException: EOF after 1 bytes out of 4
        at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
        at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:404)
        at org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:413)
        at org.apache.cassandra.db.ClusteringPrefix$Serializer.deserializeValuesWithoutSize(ClusteringPrefix.java:346)
        at org.apache.cassandra.db.Clustering$Serializer.deserialize(Clustering.java:163)
        at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:416)
        at org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:373)
        at org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:87)
        at org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:65)
        at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.doCompute(SSTableIdentityIterator.java:123)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:100)
        ... 18 more
haaawk commented Jan 2, 2019

@bhalevy

haaawk commented Jan 2, 2019

golden_sstables_dumps-scylla.txt was generated with the same command as above, but using the sstabledump from scylla-tools-java rather than the one from Cassandra.

The same sstables fail except for one:

scylla/tests/sstables/3.x/uncompressed/write_different_types/mc-1-big-Data.db

which works fine with the sstabledump from scylla-tools-java.

haaawk commented Jan 2, 2019

Running with Benny's patchset (https://github.com/bhalevy/scylla/commits/projects/gc_clock_64/WIP) makes no difference; the same sstables are failing.

haaawk commented Jan 2, 2019

I checked and all those sstables are actually used in the tests.

argenet commented Jan 2, 2019

For write_compact_table, there is a bug in Cassandra/sstabledump that I have reported as CASSANDRA-14486.
For write_shadowable_deletion, the failure is expected, as we use a Scylla-specific extension to write the second tombstone, which Cassandra does not support or recognize.
Scylla's sstabledump fork should be extended to recognize the HAS_SCYLLA_SHADOWABLE_DELETION flag and to read and output the corresponding tombstone.

As for the others, it is odd and unfortunate that they fail; I clearly remember at least some of them succeeding earlier. Most of the files can be regenerated using the CQL statements from the test comments to double-check their validity; it may be a bit harder for the *_many_partitions subset, as I generated those using a Python script.

I suspect that most, if not all, of the failures are caused by something around Statistics.db, since the data files are binary-compared, but that is merely a guess.
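For reference, a minimal sketch of regenerating one of these sstables with Cassandra from the CQL in the test comments; the keyspace, table, column layout, and data-directory path below are placeholders, not the actual statements used by the tests:

# Substitute the CQL from the relevant test comment for these placeholder statements.
cqlsh -e "CREATE KEYSPACE IF NOT EXISTS golden WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"
cqlsh -e "CREATE TABLE golden.write_example (pk int PRIMARY KEY, v text);"
cqlsh -e "INSERT INTO golden.write_example (pk, v) VALUES (1, 'a');"
# Flush memtables so the data is written out as an sstable on disk.
nodetool flush golden write_example
# Data-directory location is an assumption; adjust for the local installation.
ls /var/lib/cassandra/data/golden/write_example-*/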

argenet commented Jan 2, 2019

By the way, in case it helps: I used the sstabledump shipped with Cassandra 3.11.2; I'm not sure whether anything changed in it in 3.11.3.

haaawk commented Jan 2, 2019

Thanks @argenet - this is very useful. I will focus first on sstables other than write_compact_table, write_shadowable_deletion and *_many_partitions then.

argenet commented Jan 2, 2019

I looked a bit closer, and I think all failures except write_different_types are easy to explain: those directories don't have a Statistics.db file, which sstabledump requires in order to output the data from a Data.db file. So, strictly speaking, they are not bugs.

Statistics.db was excluded from the binary comparison in 8f686af, and those files were later removed in 3bbb013.

I have re-added Statistics.db for most of the write_ tests in bb24d37, but not for the *_many_partitions ones, as those can only be generated with scripts, so that would take some more work.
If the plan is to make all the golden sstable copies work with sstabledump, I suggest regenerating those files with a Python script using the same parameters as the unit test code.
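A quick way to see which golden sstable directories are still missing the Statistics.db component, assuming the same layout as the loop in the first comment:

for d in ../scylla/tests/sstables/3.x/uncompressed/write_*; do
    # Each generation needs its Statistics.db next to the Data.db file.
    [ -f "$d/mc-1-big-Statistics.db" ] || echo "missing Statistics.db: $d"
done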

The same applies to write_counter_test, because that one cannot be reproduced directly with the same values: counters have unique UUIDs generated for incremental updates.

That basically leaves write_different_types as the only failing test; it fails with an "Unable to find abstract-type class 'org.apache.cassandra.db.marshal.DurationType'" message.
Clearly something about the duration type representation there, but that one should be relatively easy to troubleshoot by using the CQL statements to regenerate it with Cassandra.

slivne commented Jan 3, 2019

thanks @argenet

bhalevy commented Jan 3, 2019

I tried copying the required Statistics.db files from the sstable_3_x_test output directories and verified with sstabledump (from scylla-tools-java). Besides the expected failure with write_shadowable_deletion, there is only write_different_types:
Exception in thread "main" org.apache.cassandra.exceptions.ConfigurationException: Unable to find abstract-type class 'org.apache.cassandra.db.marshal.DurationType'
at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:440)

FWIW, src/java/org/apache/cassandra/db/marshal/DurationType.java exists in the cassandra tree but not in scylla-tools-java.
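A quick way to check whether a particular sstabledump installation knows about the duration type at all; the jar location below is an assumption and will differ per installation:

# In a binary installation, look for the class inside the main jar.
unzip -l /path/to/lib/apache-cassandra-*.jar | grep DurationType
# In a source checkout, look for the source file.
find . -path '*db/marshal/DurationType.java'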

Other than that, tests/sstables/3.x/uncompressed/write_static_row/mc-1-big-Data.db:
ERROR 08:26:58 Missing component: /local/home/bhalevy/dev/scylla/tests/sstables/3.x/uncompressed/write_static_row/mc-1-big-

There is an empty line in tests/sstables/3.x/uncompressed/write_static_row/mc-1-big-TOC.txt.
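A one-liner to spot TOC.txt files that contain empty lines (an empty entry would make the tool look for a component with an empty suffix, which would explain the truncated "mc-1-big-" path in the error above):

# List TOC files that contain at least one empty line.
grep -l '^$' tests/sstables/3.x/uncompressed/*/mc-1-big-TOC.txt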

haaawk commented Jan 3, 2019

@bhalevy BTW, write_different_types fails with Cassandra's sstabledump but works fine with the one from scylla-tools-java.

argenet commented Jan 3, 2019

For write_different_types, I suggest that you try to do the following:

  1. Re-generate the sstables with the statements from the test comments using Cassandra.
  2. Check that the data file matches the one currently stored as the golden copy.
  3. Try to dump the new file using the sstabledump shipped with Cassandra.
  4. Try to dump the new file using the sstabledump from scylla-tools-java.

This may well be an sstabledump bug, like it was with compact tables (see above), so it makes sense to rule that out before digging further into the Cassandra/Scylla code. A rough sketch of steps 2-4 is below.
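A minimal sketch of steps 2-4, assuming the freshly generated sstable sits in $NEW_DIR; the two sstabledump locations shown are placeholders:

GOLDEN=scylla/tests/sstables/3.x/uncompressed/write_different_types/mc-1-big-Data.db
NEW="$NEW_DIR/mc-1-big-Data.db"
# Step 2: byte-for-byte comparison against the golden copy.
cmp "$GOLDEN" "$NEW" && echo "data files match"
# Step 3: dump with the sstabledump shipped with Cassandra.
/path/to/cassandra/tools/bin/sstabledump "$NEW"
# Step 4: dump with the sstabledump from scylla-tools-java.
/path/to/scylla-tools-java/tools/bin/sstabledump "$NEW"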

haaawk commented Jan 3, 2019

I tried with a newer version of sstabledump and it worked; I guess I was just using a version that was too old.

slivne commented Jan 6, 2019

Bottom line: we need to add support for the shadowable tombstone in sstabledump (#4056), and to merge the patches that "reintroduce" Statistics.db back into the repo.

slivne added this to the 3.1 milestone Jan 6, 2019
avikivity pushed a commit that referenced this issue Jan 6, 2019
Refs #4043

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190103112511.23488-1-bhalevy@scylladb.com>
avikivity pushed a commit that referenced this issue Jan 6, 2019
To be able to verify the golden version with sstabledump.
These files were generated by running sstable_3_x_test and keeping its
generated output files.

Refs #4043

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190103112511.23488-2-bhalevy@scylladb.com>
pdziepak pushed a commit that referenced this issue Jan 7, 2019
Refs #4043

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190103112511.23488-1-bhalevy@scylladb.com>
pdziepak pushed a commit that referenced this issue Jan 7, 2019
To be able to verify the golden version with sstabledump.
These files were generated by running sstable_3_x_test and keeping its
generated output files.

Refs #4043

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190103112511.23488-2-bhalevy@scylladb.com>
haaawk commented Jan 18, 2019

#4056 is fixed now, so I guess we can close this one too.
