ArrayIndexOutOfBoundsException during serialization #178

Closed
hpx7 opened this issue Dec 13, 2017 · 33 comments

Comments

@hpx7

hpx7 commented Dec 13, 2017

Seeing this sporadically for FrequentItems:

java.lang.ArrayIndexOutOfBoundsException: 13
	at com.yahoo.sketches.frequencies.ReversePurgeItemHashMap.getActiveValues(ReversePurgeItemHashMap.java:180)
	at com.yahoo.sketches.frequencies.ItemsSketch.toByteArray(ItemsSketch.java:316)

as well as Quantiles:

java.lang.ArrayIndexOutOfBoundsException: null
	at java.lang.System.arraycopy(Native Method)
	at com.yahoo.sketches.quantiles.ItemsByteArrayImpl.combinedBufferToItemsArray(ItemsByteArrayImpl.java:104)
	at com.yahoo.sketches.quantiles.ItemsByteArrayImpl.toByteArray(ItemsByteArrayImpl.java:55)
	at com.yahoo.sketches.quantiles.ItemsSketch.toByteArray(ItemsSketch.java:471)
	at com.yahoo.sketches.quantiles.ItemsSketch.toByteArray(ItemsSketch.java:461)

Has anyone seen this before? Might it be related to memory corruption as we suspected in #175?

@jmalkin
Contributor

jmalkin commented Dec 14, 2017

We have not seen this. Please provide more context for the errors: Which jar version? Are you running via java directly, hive, pig, druid, etc? Which serde are you using with these sketches? If you can get it to print the sketch preamble info, that'd also be useful.

@hpx7
Author

hpx7 commented Dec 14, 2017

  • Using sketches-core 0.10.3
  • Running this in spark
  • Using ArrayOfNumbersSerDe/ArrayOfStringsSerDe

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Dec 14, 2017 via email

@hpx7
Author

hpx7 commented Dec 14, 2017

Yeah we use Java serialization (we use toByteArray and just write the byte array to an ObjectOutputStream).

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Dec 14, 2017 via email

@hpx7
Author

hpx7 commented Dec 18, 2017

Looks something like this:

public final class NumericDistribution {

    private static final Comparator<Number> COMPARATOR = Comparator.comparing(Number::doubleValue);

    private ItemsSketch<Number> sketch;

    public NumericDistribution(int distributionK) {
        this(ItemsSketch.getInstance(distributionK, COMPARATOR));
    }

    public NumericDistribution(byte[] data) {
        this(ItemsSketch.getInstance(Memory.wrap(data), COMPARATOR, new ArrayOfNumbersSerDe()));
    }

    public void update(Number item) {
        // ignore NaN, Infinity, -Infinity
        if (Double.isFinite(item.doubleValue())) {
            sketch.update(item);
        }
    }

    public void merge(NumericDistribution other) {
        ItemsUnion<Number> union = ItemsUnion.getInstance(sketch);
        union.update(other.sketch);
        sketch = union.getResult();
    }

    private void writeObject(ObjectOutputStream outputStream) throws IOException {
        outputStream.writeObject(sketch.toByteArray(new ArrayOfNumbersSerDe()));
    }

    private void readObject(ObjectInputStream inputStream) throws ClassNotFoundException, IOException {
        byte[] data = (byte[]) inputStream.readObject();
        sketch = new NumericDistribution(data).sketch;
    }
}

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Dec 19, 2017 via email

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Dec 19, 2017 via email

@hpx7
Author

hpx7 commented Dec 20, 2017

Appreciate the pointers. Will try this and report back.

@szunami

szunami commented Jan 16, 2018

I'm a colleague of @hpx7. We added the above logging to our code and received the following dump:

### WritableMemoryImpl SUMMARY ###
Header Comment      : 
Call Params         : .toHexString(..., 0, 4046), hashCode: 291048800
NativeBaseOffset    : 0
UnsafeObj, hashCode : byte[], 212194479
UnsafeObjHeader     : 16
ByteBuf, hashCode   : null
RegionOffset        : 0
Capacity            : 4046
CumBaseOffset       : 16
MemReq, hashCode    : DefaultMemoryManager, 593466056
Valid   : true
Resource Read Only  : false
Resource Endianness : LITTLE_ENDIAN
JDK Version         : 8
Data, littleEndian  : < data omitted>

And the exception includes the following message:

"reqOffset: 4030, reqLength: , (reqOff + reqLen): 4031, allocSize: 4030"

This error occurs many times when building a Distribution of a given Spark column. Interestingly, for each of these errors, capacity - allocSize = 16 (capacity as described by the memory header, and allocSize described by the exception message). This happens to coincide with both the UnsafeObjHeader size and the CumBaseOffset as described by the memory header.

Any insight @AlexanderSaydakov ? Thanks very much for the help thus far.

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Jan 16, 2018 via email

@leerho
Contributor

leerho commented Jan 16, 2018

Guys, we are trying to help you, but we can't unless we get full stack traces.

The message:

"reqOffset: 4030, reqLength: , (reqOff + reqLen): 4031, allocSize: 4030"

comes from UnsafeUtil.assertBounds(). There are probably a hundred places where this is called from the Memory code and it is used to not only check internal Memory bounds but also bounds on primitive arrays (byte[], long[], etc) that are being put into memory or extracted out of memory ("get...").

The fact that the value in the error message (allocSize = 4030) differs from the value in the Memory.toHex() output you printed (Capacity = 4046) means that they are unrelated.

Either the error message comes from a different Memory resource OR the message comes from a check on the allocation size of a primitive array being placed into Memory, or the allocation of an array meant to receive data from Memory.

We need the full stack trace of what leads up to the above error message, and I mean ALL traces, please. Otherwise, we will be going around this loop forever.
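For readers following along, the kind of bounds check described above can be sketched roughly as below. This is a hypothetical stand-in, not the library's actual code: the class name, method name, and message layout are assumptions modeled on the quoted message. The call assumes a request length of 1, which the reported sum (4031) and offset (4030) imply; the length itself is blank in the message as posted.

```java
final class BoundsCheckSketch {

    // Hypothetical stand-in for a bounds assertion; not the library's actual code.
    static void checkBounds(long reqOff, long reqLen, long allocSize) {
        // Reject negative inputs and any request that runs past the allocation.
        if (reqOff < 0 || reqLen < 0 || reqOff + reqLen > allocSize) {
            throw new IllegalArgumentException(
                "reqOffset: " + reqOff + ", reqLength: " + reqLen
                + ", (reqOff + reqLen): " + (reqOff + reqLen)
                + ", allocSize: " + allocSize);
        }
    }

    public static void main(String[] args) {
        checkBounds(4029, 1, 4030); // the last valid byte: passes
        try {
            checkBounds(4030, 1, 4030); // one byte past the end: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a read of one byte at offset 4030 fails against a 4030-byte allocation, whichever resource or array that allocation turns out to be.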

@szunami

szunami commented Jan 16, 2018

Here is the stacktrace that corresponds to the above dump.

java.lang.IllegalArgumentException: reqOffset: 4030, reqLength: , (reqOff + reqLen): 4031, allocSize: 4030
	at com.yahoo.memory.UnsafeUtil.checkBounds(UnsafeUtil.java:156)
	at com.yahoo.sketches.ArrayOfNumbersSerDe.deserializeFromMemory(ArrayOfNumbersSerDe.java:93)
	at com.yahoo.sketches.ArrayOfNumbersSerDe.deserializeFromMemory(ArrayOfNumbersSerDe.java:25)
	at com.yahoo.sketches.quantiles.ItemsSketch.getInstance(ItemsSketch.java:200)

This stacktrace is printed along with the memory dump in the catch/rethrow block that we added as per @AlexanderSaydakov 's suggestion.
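A catch/rethrow block of this kind might look roughly like the sketch below. It is not the code actually used in this thread: `DumpOnFailure`, `hexDump`, and `deserializeOrDump` are hypothetical names, the deserializer is passed in as a plain function so the example needs only the JDK, and the dump is produced by a small helper rather than by the library's own hex output.

```java
import java.util.function.Function;

final class DumpOnFailure {

    // Formats bytes as rows of eight hex values, each row prefixed by its offset.
    static String hexDump(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i % 8 == 0) {
                if (i > 0) {
                    sb.append('\n');
                }
                sb.append(String.format("%6d:", i));
            }
            sb.append(String.format(" %02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    // Runs the supplied (hypothetical) deserializer; on failure, rethrows
    // with the raw input attached so the bytes can be inspected later.
    static <T> T deserializeOrDump(byte[] data, Function<byte[], T> deserializer) {
        try {
            return deserializer.apply(data);
        } catch (RuntimeException e) {
            throw new IllegalStateException(
                "Deserialization failed for " + data.length + " bytes:\n" + hexDump(data), e);
        }
    }
}
```

Keeping the original exception as the cause preserves the full stack trace alongside the bytes that triggered it.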

@hpx7
Author

hpx7 commented Jan 16, 2018

The code this comes from:

    public NumericDistribution(byte[] data) {
        this(ItemsSketch.getInstance(Memory.wrap(data), COMPARATOR, new ArrayOfNumbersSerDe()));
    }

    private void readObject(ObjectInputStream inputStream) throws ClassNotFoundException, IOException {
        byte[] data = (byte[]) inputStream.readObject();
        sketch = new NumericDistribution(data).sketch;
    }

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Jan 16, 2018 via email

@leerho
Contributor

leerho commented Jan 17, 2018

Let me clarify: we need to see the raw bytes output from the memory.toHex() method. It should be about 4K bytes.

@szunami

szunami commented Jan 17, 2018

Sure thing, hex data below:

Data

Data, littleEndian  :  0  1  2  3  4  5  6  7
                   0: 02 03 08 08 80 00 00 00 
                   8: 25 a2 13 00 00 00 00 00 
                  16: 09 01 00 00 00 09 01 00 
                  24: 00 00 09 01 00 00 00 09 
                  32: 01 00 00 00 09 01 00 00 
                  40: 00 09 01 00 00 00 09 01 
                  48: 00 00 00 09 01 00 00 00 
                  56: 09 01 00 00 00 09 01 00 
                  64: 00 00 09 01 00 00 00 09 
                  72: 01 00 00 00 09 01 00 00 
                  80: 00 09 01 00 00 00 09 01 
                  88: 00 00 00 09 01 00 00 00 
                  96: 09 01 00 00 00 09 01 00 
                 104: 00 00 09 01 00 00 00 09 
                 112: 01 00 00 00 09 01 00 00 
                 120: 00 09 01 00 00 00 09 01 
                 128: 00 00 00 09 01 00 00 00 
                 136: 09 01 00 00 00 09 01 00 
                 144: 00 00 09 01 00 00 00 09 
                 152: 01 00 00 00 09 01 00 00 
                 160: 00 09 01 00 00 00 09 01 
                 168: 00 00 00 09 01 00 00 00 
                 176: 09 01 00 00 00 09 01 00 
                 184: 00 00 09 01 00 00 00 09 
                 192: 01 00 00 00 09 01 00 00 
                 200: 00 09 01 00 00 00 09 01 
                 208: 00 00 00 09 01 00 00 00 
                 216: 09 01 00 00 00 09 01 00 
                 224: 00 00 09 01 00 00 00 09 
                 232: 01 00 00 00 09 01 00 00 
                 240: 00 09 01 00 00 00 09 01 
                 248: 00 00 00 09 01 00 00 00 
                 256: 09 01 00 00 00 09 01 00 
                 264: 00 00 09 01 00 00 00 09 
                 272: 01 00 00 00 09 01 00 00 
                 280: 00 09 01 00 00 00 09 01 
                 288: 00 00 00 09 01 00 00 00 
                 296: 09 01 00 00 00 09 01 00 
                 304: 00 00 09 01 00 00 00 09 
                 312: 01 00 00 00 09 01 00 00 
                 320: 00 09 01 00 00 00 09 01 
                 328: 00 00 00 09 01 00 00 00 
                 336: 09 01 00 00 00 09 01 00 
                 344: 00 00 09 01 00 00 00 09 
                 352: 01 00 00 00 09 01 00 00 
                 360: 00 09 01 00 00 00 09 01 
                 368: 00 00 00 09 01 00 00 00 
                 376: 09 01 00 00 00 09 01 00 
                 384: 00 00 09 01 00 00 00 09 
                 392: 01 00 00 00 09 01 00 00 
                 400: 00 09 01 00 00 00 09 01 
                 408: 00 00 00 09 01 00 00 00 
                 416: 09 01 00 00 00 09 01 00 
                 424: 00 00 09 01 00 00 00 09 
                 432: 01 00 00 00 09 01 00 00 
                 440: 00 09 01 00 00 00 09 01 
                 448: 00 00 00 09 01 00 00 00 
                 456: 09 01 00 00 00 09 01 00 
                 464: 00 00 09 01 00 00 00 09 
                 472: 01 00 00 00 09 01 00 00 
                 480: 00 09 01 00 00 00 09 01 
                 488: 00 00 00 09 01 00 00 00 
                 496: 09 01 00 00 00 09 01 00 
                 504: 00 00 09 01 00 00 00 09 
                 512: 01 00 00 00 09 01 00 00 
                 520: 00 09 01 00 00 00 09 01 
                 528: 00 00 00 09 01 00 00 00 
                 536: 09 01 00 00 00 09 01 00 
                 544: 00 00 09 01 00 00 00 09 
                 552: 01 00 00 00 09 01 00 00 
                 560: 00 09 01 00 00 00 09 01 
                 568: 00 00 00 09 01 00 00 00 
                 576: 09 01 00 00 00 09 01 00 
                 584: 00 00 09 01 00 00 00 09 
                 592: 01 00 00 00 09 01 00 00 
                 600: 00 09 01 00 00 00 09 01 
                 608: 00 00 00 09 01 00 00 00 
                 616: 09 01 00 00 00 09 01 00 
                 624: 00 00 09 01 00 00 00 09 
                 632: 01 00 00 00 09 01 00 00 
                 640: 00 09 01 00 00 00 09 01 
                 648: 00 00 00 09 01 00 00 00 
                 656: 09 01 00 00 00 09 01 00 
                 664: 00 00 09 01 00 00 00 09 
                 672: 01 00 00 00 09 01 00 00 
                 680: 00 09 01 00 00 00 09 01 
                 688: 00 00 00 09 01 00 00 00 
                 696: 09 01 00 00 00 09 01 00 
                 704: 00 00 09 01 00 00 00 09 
                 712: 01 00 00 00 09 01 00 00 
                 720: 00 09 01 00 00 00 09 01 
                 728: 00 00 00 09 01 00 00 00 
                 736: 09 01 00 00 00 09 01 00 
                 744: 00 00 09 01 00 00 00 09 
                 752: 01 00 00 00 09 01 00 00 
                 760: 00 09 01 00 00 00 09 01 
                 768: 00 00 00 09 01 00 00 00 
                 776: 09 01 00 00 00 09 01 00 
                 784: 00 00 09 01 00 00 00 09 
                 792: 01 00 00 00 09 01 00 00 
                 800: 00 09 01 00 00 00 09 01 
                 808: 00 00 00 09 01 00 00 00 
                 816: 09 01 00 00 00 09 01 00 
                 824: 00 00 09 01 00 00 00 09 
                 832: 01 00 00 00 09 01 00 00 
                 840: 00 09 01 00 00 00 09 01 
                 848: 00 00 00 09 01 00 00 00 
                 856: 09 01 00 00 00 09 01 00 
                 864: 00 00 09 01 00 00 00 09 
                 872: 01 00 00 00 09 01 00 00 
                 880: 00 09 01 00 00 00 09 01 
                 888: 00 00 00 09 01 00 00 00 
                 896: 09 01 00 00 00 09 01 00 
                 904: 00 00 09 01 00 00 00 09 
                 912: 01 00 00 00 09 01 00 00 
                 920: 00 09 01 00 00 00 09 01 
                 928: 00 00 00 09 01 00 00 00 
                 936: 09 01 00 00 00 09 01 00 
                 944: 00 00 09 01 00 00 00 09 
                 952: 01 00 00 00 09 01 00 00 
                 960: 00 09 01 00 00 00 09 01 
                 968: 00 00 00 09 01 00 00 00 
                 976: 09 01 00 00 00 09 01 00 
                 984: 00 00 09 01 00 00 00 09 
                 992: 01 00 00 00 09 01 00 00 
                1000: 00 09 01 00 00 00 09 01 
                1008: 00 00 00 09 01 00 00 00 
                1016: 09 01 00 00 00 09 01 00 
                1024: 00 00 09 01 00 00 00 09 
                1032: 01 00 00 00 09 01 00 00 
                1040: 00 09 01 00 00 00 09 01 
                1048: 00 00 00 09 01 00 00 00 
                1056: 09 01 00 00 00 09 01 00 
                1064: 00 00 09 01 00 00 00 09 
                1072: 01 00 00 00 09 01 00 00 
                1080: 00 09 01 00 00 00 09 01 
                1088: 00 00 00 09 01 00 00 00 
                1096: 09 01 00 00 00 09 01 00 
                1104: 00 00 09 01 00 00 00 09 
                1112: 01 00 00 00 09 01 00 00 
                1120: 00 09 01 00 00 00 09 01 
                1128: 00 00 00 09 01 00 00 00 
                1136: 09 01 00 00 00 09 01 00 
                1144: 00 00 09 01 00 00 00 09 
                1152: 01 00 00 00 09 01 00 00 
                1160: 00 09 01 00 00 00 09 01 
                1168: 00 00 00 09 01 00 00 00 
                1176: 09 01 00 00 00 09 01 00 
                1184: 00 00 09 01 00 00 00 09 
                1192: 01 00 00 00 09 01 00 00 
                1200: 00 09 01 00 00 00 09 01 
                1208: 00 00 00 09 01 00 00 00 
                1216: 09 01 00 00 00 09 01 00 
                1224: 00 00 09 01 00 00 00 09 
                1232: 01 00 00 00 09 01 00 00 
                1240: 00 09 01 00 00 00 09 01 
                1248: 00 00 00 09 01 00 00 00 
                1256: 09 01 00 00 00 09 01 00 
                1264: 00 00 09 01 00 00 00 09 
                1272: 01 00 00 00 09 01 00 00 
                1280: 00 09 01 00 00 00 09 01 
                1288: 00 00 00 09 01 00 00 00 
                1296: 09 01 00 00 00 09 01 00 
                1304: 00 00 09 01 00 00 00 09 
                1312: 01 00 00 00 09 01 00 00 
                1320: 00 09 01 00 00 00 09 01 
                1328: 00 00 00 09 01 00 00 00 
                1336: 09 01 00 00 00 09 01 00 
                1344: 00 00 09 01 00 00 00 09 
                1352: 01 00 00 00 09 01 00 00 
                1360: 00 09 01 00 00 00 09 01 
                1368: 00 00 00 09 01 00 00 00 
                1376: 09 01 00 00 00 09 01 00 
                1384: 00 00 09 01 00 00 00 09 
                1392: 01 00 00 00 09 01 00 00 
                1400: 00 09 01 00 00 00 09 01 
                1408: 00 00 00 09 01 00 00 00 
                1416: 09 01 00 00 00 09 01 00 
                1424: 00 00 09 01 00 00 00 09 
                1432: 01 00 00 00 09 01 00 00 
                1440: 00 09 01 00 00 00 09 01 
                1448: 00 00 00 09 01 00 00 00 
                1456: 09 01 00 00 00 09 01 00 
                1464: 00 00 09 01 00 00 00 09 
                1472: 01 00 00 00 09 01 00 00 
                1480: 00 09 01 00 00 00 09 01 
                1488: 00 00 00 09 01 00 00 00 
                1496: 09 01 00 00 00 09 01 00 
                1504: 00 00 09 01 00 00 00 09 
                1512: 01 00 00 00 09 01 00 00 
                1520: 00 09 01 00 00 00 09 01 
                1528: 00 00 00 09 01 00 00 00 
                1536: 09 01 00 00 00 09 01 00 
                1544: 00 00 09 01 00 00 00 09 
                1552: 01 00 00 00 09 01 00 00 
                1560: 00 09 01 00 00 00 09 01 
                1568: 00 00 00 09 01 00 00 00 
                1576: 09 01 00 00 00 09 01 00 
                1584: 00 00 09 01 00 00 00 09 
                1592: 01 00 00 00 09 01 00 00 
                1600: 00 09 01 00 00 00 09 01 
                1608: 00 00 00 09 01 00 00 00 
                1616: 09 01 00 00 00 09 01 00 
                1624: 00 00 09 01 00 00 00 09 
                1632: 01 00 00 00 09 01 00 00 
                1640: 00 09 01 00 00 00 09 01 
                1648: 00 00 00 09 01 00 00 00 
                1656: 09 01 00 00 00 09 01 00 
                1664: 00 00 09 01 00 00 00 09 
                1672: 01 00 00 00 09 01 00 00 
                1680: 00 09 01 00 00 00 09 01 
                1688: 00 00 00 09 01 00 00 00 
                1696: 09 01 00 00 00 09 01 00 
                1704: 00 00 09 01 00 00 00 09 
                1712: 01 00 00 00 09 01 00 00 
                1720: 00 09 01 00 00 00 09 01 
                1728: 00 00 00 09 01 00 00 00 
                1736: 09 01 00 00 00 09 01 00 
                1744: 00 00 09 01 00 00 00 09 
                1752: 01 00 00 00 09 01 00 00 
                1760: 00 09 01 00 00 00 09 01 
                1768: 00 00 00 09 01 00 00 00 
                1776: 09 01 00 00 00 09 01 00 
                1784: 00 00 09 01 00 00 00 09 
                1792: 01 00 00 00 09 01 00 00 
                1800: 00 09 01 00 00 00 09 01 
                1808: 00 00 00 09 01 00 00 00 
                1816: 09 01 00 00 00 09 01 00 
                1824: 00 00 09 01 00 00 00 09 
                1832: 01 00 00 00 09 01 00 00 
                1840: 00 09 01 00 00 00 09 01 
                1848: 00 00 00 09 01 00 00 00 
                1856: 09 01 00 00 00 09 01 00 
                1864: 00 00 09 01 00 00 00 09 
                1872: 01 00 00 00 09 01 00 00 
                1880: 00 09 01 00 00 00 09 01 
                1888: 00 00 00 09 01 00 00 00 
                1896: 09 01 00 00 00 09 01 00 
                1904: 00 00 09 01 00 00 00 09 
                1912: 01 00 00 00 09 01 00 00 
                1920: 00 09 01 00 00 00 09 01 
                1928: 00 00 00 09 01 00 00 00 
                1936: 09 01 00 00 00 09 01 00 
                1944: 00 00 09 01 00 00 00 09 
                1952: 01 00 00 00 09 01 00 00 
                1960: 00 09 01 00 00 00 09 01 
                1968: 00 00 00 09 01 00 00 00 
                1976: 09 01 00 00 00 09 01 00 
                1984: 00 00 09 01 00 00 00 09 
                1992: 01 00 00 00 09 01 00 00 
                2000: 00 09 01 00 00 00 09 01 
                2008: 00 00 00 09 01 00 00 00 
                2016: 09 01 00 00 00 09 01 00 
                2024: 00 00 09 01 00 00 00 09 
                2032: 01 00 00 00 09 01 00 00 
                2040: 00 09 01 00 00 00 09 01 
                2048: 00 00 00 09 01 00 00 00 
                2056: 09 01 00 00 00 09 01 00 
                2064: 00 00 09 01 00 00 00 09 
                2072: 01 00 00 00 09 01 00 00 
                2080: 00 09 01 00 00 00 09 01 
                2088: 00 00 00 09 01 00 00 00 
                2096: 09 01 00 00 00 09 01 00 
                2104: 00 00 09 01 00 00 00 09 
                2112: 01 00 00 00 09 01 00 00 
                2120: 00 09 01 00 00 00 09 01 
                2128: 00 00 00 09 01 00 00 00 
                2136: 09 01 00 00 00 09 01 00 
                2144: 00 00 09 01 00 00 00 09 
                2152: 01 00 00 00 09 01 00 00 
                2160: 00 09 01 00 00 00 09 01 
                2168: 00 00 00 09 01 00 00 00 
                2176: 09 01 00 00 00 09 01 00 
                2184: 00 00 09 01 00 00 00 09 
                2192: 01 00 00 00 09 01 00 00 
                2200: 00 09 01 00 00 00 09 01 
                2208: 00 00 00 09 01 00 00 00 
                2216: 09 01 00 00 00 09 01 00 
                2224: 00 00 09 01 00 00 00 09 
                2232: 01 00 00 00 09 01 00 00 
                2240: 00 09 01 00 00 00 09 01 
                2248: 00 00 00 09 01 00 00 00 
                2256: 09 01 00 00 00 09 01 00 
                2264: 00 00 09 01 00 00 00 09 
                2272: 01 00 00 00 09 01 00 00 
                2280: 00 09 01 00 00 00 09 01 
                2288: 00 00 00 09 01 00 00 00 
                2296: 09 01 00 00 00 09 01 00 
                2304: 00 00 09 01 00 00 00 09 
                2312: 01 00 00 00 09 01 00 00 
                2320: 00 09 01 00 00 00 09 01 
                2328: 00 00 00 09 01 00 00 00 
                2336: 09 01 00 00 00 09 01 00 
                2344: 00 00 09 01 00 00 00 09 
                2352: 01 00 00 00 09 01 00 00 
                2360: 00 09 01 00 00 00 09 01 
                2368: 00 00 00 09 01 00 00 00 
                2376: 09 01 00 00 00 09 01 00 
                2384: 00 00 09 01 00 00 00 09 
                2392: 01 00 00 00 09 01 00 00 
                2400: 00 09 01 00 00 00 09 01 
                2408: 00 00 00 09 01 00 00 00 
                2416: 09 01 00 00 00 09 01 00 
                2424: 00 00 09 01 00 00 00 09 
                2432: 01 00 00 00 09 01 00 00 
                2440: 00 09 01 00 00 00 09 01 
                2448: 00 00 00 09 01 00 00 00 
                2456: 09 01 00 00 00 09 01 00 
                2464: 00 00 09 01 00 00 00 09 
                2472: 01 00 00 00 09 01 00 00 
                2480: 00 09 01 00 00 00 09 01 
                2488: 00 00 00 09 01 00 00 00 
                2496: 09 01 00 00 00 09 01 00 
                2504: 00 00 09 01 00 00 00 09 
                2512: 01 00 00 00 09 01 00 00 
                2520: 00 09 01 00 00 00 09 01 
                2528: 00 00 00 09 01 00 00 00 
                2536: 09 01 00 00 00 09 01 00 
                2544: 00 00 09 01 00 00 00 09 
                2552: 01 00 00 00 09 01 00 00 
                2560: 00 09 01 00 00 00 09 01 
                2568: 00 00 00 09 01 00 00 00 
                2576: 09 01 00 00 00 09 01 00 
                2584: 00 00 09 01 00 00 00 09 
                2592: 01 00 00 00 09 01 00 00 
                2600: 00 09 01 00 00 00 09 01 
                2608: 00 00 00 09 01 00 00 00 
                2616: 09 01 00 00 00 09 01 00 
                2624: 00 00 09 01 00 00 00 09 
                2632: 01 00 00 00 09 01 00 00 
                2640: 00 09 01 00 00 00 09 01 
                2648: 00 00 00 09 01 00 00 00 
                2656: 09 01 00 00 00 09 01 00 
                2664: 00 00 09 01 00 00 00 09 
                2672: 01 00 00 00 09 01 00 00 
                2680: 00 09 01 00 00 00 09 01 
                2688: 00 00 00 09 01 00 00 00 
                2696: 09 01 00 00 00 09 01 00 
                2704: 00 00 09 01 00 00 00 09 
                2712: 01 00 00 00 09 01 00 00 
                2720: 00 09 01 00 00 00 09 01 
                2728: 00 00 00 09 01 00 00 00 
                2736: 09 01 00 00 00 09 01 00 
                2744: 00 00 09 01 00 00 00 09 
                2752: 01 00 00 00 09 01 00 00 
                2760: 00 09 01 00 00 00 09 01 
                2768: 00 00 00 09 01 00 00 00 
                2776: 09 01 00 00 00 09 01 00 
                2784: 00 00 09 01 00 00 00 09 
                2792: 01 00 00 00 09 01 00 00 
                2800: 00 09 01 00 00 00 09 01 
                2808: 00 00 00 09 01 00 00 00 
                2816: 09 01 00 00 00 09 01 00 
                2824: 00 00 09 01 00 00 00 09 
                2832: 01 00 00 00 09 01 00 00 
                2840: 00 09 01 00 00 00 09 01 
                2848: 00 00 00 09 01 00 00 00 
                2856: 09 01 00 00 00 09 01 00 
                2864: 00 00 09 01 00 00 00 09 
                2872: 01 00 00 00 09 01 00 00 
                2880: 00 09 01 00 00 00 09 01 
                2888: 00 00 00 09 01 00 00 00 
                2896: 09 01 00 00 00 09 01 00 
                2904: 00 00 09 01 00 00 00 09 
                2912: 01 00 00 00 09 01 00 00 
                2920: 00 09 01 00 00 00 09 01 
                2928: 00 00 00 09 01 00 00 00 
                2936: 09 01 00 00 00 09 01 00 
                2944: 00 00 09 01 00 00 00 09 
                2952: 01 00 00 00 09 01 00 00 
                2960: 00 09 01 00 00 00 09 01 
                2968: 00 00 00 09 01 00 00 00 
                2976: 09 01 00 00 00 09 01 00 
                2984: 00 00 09 01 00 00 00 09 
                2992: 01 00 00 00 09 01 00 00 
                3000: 00 09 01 00 00 00 09 01 
                3008: 00 00 00 09 01 00 00 00 
                3016: 09 01 00 00 00 09 01 00 
                3024: 00 00 09 01 00 00 00 09 
                3032: 01 00 00 00 09 01 00 00 
                3040: 00 09 01 00 00 00 09 01 
                3048: 00 00 00 09 01 00 00 00 
                3056: 09 01 00 00 00 09 01 00 
                3064: 00 00 09 01 00 00 00 09 
                3072: 01 00 00 00 09 01 00 00 
                3080: 00 09 01 00 00 00 09 01 
                3088: 00 00 00 09 01 00 00 00 
                3096: 09 01 00 00 00 09 01 00 
                3104: 00 00 09 01 00 00 00 09 
                3112: 01 00 00 00 09 01 00 00 
                3120: 00 09 01 00 00 00 09 01 
                3128: 00 00 00 09 01 00 00 00 
                3136: 09 01 00 00 00 09 01 00 
                3144: 00 00 09 01 00 00 00 09 
                3152: 01 00 00 00 09 01 00 00 
                3160: 00 09 01 00 00 00 09 01 
                3168: 00 00 00 09 01 00 00 00 
                3176: 09 01 00 00 00 09 01 00 
                3184: 00 00 09 01 00 00 00 09 
                3192: 01 00 00 00 09 01 00 00 
                3200: 00 09 01 00 00 00 09 01 
                3208: 00 00 00 09 01 00 00 00 
                3216: 09 01 00 00 00 09 01 00 
                3224: 00 00 09 01 00 00 00 09 
                3232: 01 00 00 00 09 01 00 00 
                3240: 00 09 01 00 00 00 09 01 
                3248: 00 00 00 09 01 00 00 00 
                3256: 09 01 00 00 00 09 01 00 
                3264: 00 00 09 01 00 00 00 09 
                3272: 01 00 00 00 09 01 00 00 
                3280: 00 09 01 00 00 00 09 01 
                3288: 00 00 00 09 01 00 00 00 
                3296: 09 01 00 00 00 09 01 00 
                3304: 00 00 09 01 00 00 00 09 
                3312: 01 00 00 00 09 01 00 00 
                3320: 00 09 01 00 00 00 09 01 
                3328: 00 00 00 09 01 00 00 00 
                3336: 09 01 00 00 00 09 01 00 
                3344: 00 00 09 01 00 00 00 09 
                3352: 01 00 00 00 09 01 00 00 
                3360: 00 09 01 00 00 00 09 01 
                3368: 00 00 00 09 01 00 00 00 
                3376: 09 01 00 00 00 09 01 00 
                3384: 00 00 09 01 00 00 00 09 
                3392: 01 00 00 00 09 01 00 00 
                3400: 00 09 01 00 00 00 09 01 
                3408: 00 00 00 09 01 00 00 00 
                3416: 09 01 00 00 00 09 01 00 
                3424: 00 00 09 01 00 00 00 09 
                3432: 01 00 00 00 09 01 00 00 
                3440: 00 09 01 00 00 00 09 01 
                3448: 00 00 00 09 01 00 00 00 
                3456: 09 01 00 00 00 09 01 00 
                3464: 00 00 09 01 00 00 00 09 
                3472: 01 00 00 00 09 01 00 00 
                3480: 00 09 01 00 00 00 09 01 
                3488: 00 00 00 09 01 00 00 00 
                3496: 09 01 00 00 00 09 01 00 
                3504: 00 00 09 01 00 00 00 09 
                3512: 01 00 00 00 09 01 00 00 
                3520: 00 09 01 00 00 00 09 01 
                3528: 00 00 00 09 01 00 00 00 
                3536: 09 01 00 00 00 09 01 00 
                3544: 00 00 09 01 00 00 00 09 
                3552: 01 00 00 00 09 01 00 00 
                3560: 00 09 01 00 00 00 09 01 
                3568: 00 00 00 09 01 00 00 00 
                3576: 09 01 00 00 00 09 01 00 
                3584: 00 00 09 01 00 00 00 09 
                3592: 01 00 00 00 09 01 00 00 
                3600: 00 09 01 00 00 00 09 01 
                3608: 00 00 00 09 01 00 00 00 
                3616: 09 01 00 00 00 09 01 00 
                3624: 00 00 09 01 00 00 00 09 
                3632: 01 00 00 00 09 01 00 00 
                3640: 00 09 01 00 00 00 09 01 
                3648: 00 00 00 09 01 00 00 00 
                3656: 09 01 00 00 00 09 01 00 
                3664: 00 00 09 01 00 00 00 09 
                3672: 01 00 00 00 09 01 00 00 
                3680: 00 09 01 00 00 00 09 01 
                3688: 00 00 00 09 01 00 00 00 
                3696: 09 01 00 00 00 09 01 00 
                3704: 00 00 09 01 00 00 00 09 
                3712: 01 00 00 00 09 01 00 00 
                3720: 00 09 01 00 00 00 09 01 
                3728: 00 00 00 09 01 00 00 00 
                3736: 09 01 00 00 00 09 01 00 
                3744: 00 00 09 01 00 00 00 09 
                3752: 01 00 00 00 09 01 00 00 
                3760: 00 09 01 00 00 00 09 01 
                3768: 00 00 00 09 01 00 00 00 
                3776: 09 01 00 00 00 09 01 00 
                3784: 00 00 09 01 00 00 00 09 
                3792: 01 00 00 00 09 01 00 00 
                3800: 00 09 01 00 00 00 09 01 
                3808: 00 00 00 09 01 00 00 00 
                3816: 09 01 00 00 00 09 01 00 
                3824: 00 00 09 01 00 00 00 09 
                3832: 01 00 00 00 09 01 00 00 
                3840: 00 09 01 00 00 00 09 01 
                3848: 00 00 00 09 01 00 00 00 
                3856: 09 01 00 00 00 09 01 00 
                3864: 00 00 09 01 00 00 00 09 
                3872: 01 00 00 00 09 01 00 00 
                3880: 00 09 01 00 00 00 09 01 
                3888: 00 00 00 09 01 00 00 00 
                3896: 09 01 00 00 00 09 01 00 
                3904: 00 00 09 01 00 00 00 09 
                3912: 01 00 00 00 09 01 00 00 
                3920: 00 09 01 00 00 00 09 01 
                3928: 00 00 00 09 01 00 00 00 
                3936: 09 01 00 00 00 09 01 00 
                3944: 00 00 09 01 00 00 00 09 
                3952: 01 00 00 00 09 01 00 00 
                3960: 00 09 01 00 00 00 09 01 
                3968: 00 00 00 09 01 00 00 00 
                3976: 09 01 00 00 00 09 01 00 
                3984: 00 00 09 01 00 00 00 09 
                3992: 01 00 00 00 09 01 00 00 
                4000: 00 09 01 00 00 00 09 01 
                4008: 00 00 00 09 01 00 00 00 
                4016: 09 01 00 00 00 09 01 00 
                4024: 00 00 09 01 00 00 00 09 
                4032: 01 00 00 00 09 01 00 00 
                4040: 00 09 01 00 00 00

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Jan 17, 2018 via email

@hpx7
Author

hpx7 commented Jan 17, 2018

Not sure if relevant, but I believe the input was doubles.

@hpx7
Author

hpx7 commented Jan 17, 2018

Let us know if it would be helpful to have more memory dumps as well (we have a few more).

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Jan 17, 2018 via email

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Jan 17, 2018 via email

@hpx7
Author

hpx7 commented Jan 17, 2018

I see, this must have been a dump for an integer column then.

I suppose the reproduction is complicated by the fact that we do other operations besides update. For example, we also merge and serialize/deserialize sketches during the process of computing them (ultimately driven by spark).

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Jan 17, 2018 via email

@leerho
Contributor

leerho commented Jan 18, 2018

None of the sketches in the library are multi-threaded. If you have concurrent threads reading and writing to the same sketch you must make your sketch wrapper synchronized.
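As a rough illustration of what a synchronized wrapper could look like, here is a minimal sketch. It uses a plain list as a stand-in for the sketch so that it runs with only the JDK; `SynchronizedDistribution`, `snapshot`, and `size` are hypothetical names, and in a real wrapper every method that touches the sketch, including the one that serializes it, would go through the same lock.

```java
import java.util.ArrayList;
import java.util.List;

final class SynchronizedDistribution {

    // Stand-in for a non-thread-safe sketch; a real wrapper would hold the sketch here.
    private final List<Double> values = new ArrayList<>();

    public synchronized void update(Number item) {
        double v = item.doubleValue();
        if (Double.isFinite(v)) { // ignore NaN, Infinity, -Infinity
            values.add(v);
        }
    }

    public void merge(SynchronizedDistribution other) {
        // Copy the other side under its own lock, then mutate under ours,
        // so the two locks are never held at the same time.
        List<Double> copy = other.snapshot();
        synchronized (this) {
            values.addAll(copy);
        }
    }

    // Stands in for serialization: reads the state while holding the lock.
    public synchronized List<Double> snapshot() {
        return new ArrayList<>(values);
    }

    public synchronized int size() {
        return values.size();
    }
}
```

Taking the copy before acquiring the second lock is one way to avoid a deadlock when two wrappers merge into each other at the same time; other designs are possible.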

@hpx7
Author

hpx7 commented Jan 18, 2018

We don't have concurrent threads reading and writing to the same sketch (Spark parallelizes by splitting the data across machines -- within a given process we iterate over the data sequentially).

@AlexanderSaydakov I've updated #178 (comment) to include our update+merge methods.

When I say sporadic failures, I mean that the computation fails on certain datasets and not others. For a given dataset, the failure/nonfailure is consistent on retries. However, it's been difficult to reproduce locally even if I download the problematic dataset, since I don't have a clustered spark setup on my local machine.

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Jan 18, 2018 via email

@hpx7
Author

hpx7 commented Jan 18, 2018

Fair enough, we can synchronize all the methods just to rule this out.

@AlexanderSaydakov
Contributor

AlexanderSaydakov commented Jan 22, 2018 via email

@szunami

szunami commented Jan 29, 2018

@AlexanderSaydakov after digging into the Spark architecture a bit, our non-threadsafe code does seem like a likely root cause here. We are in the process of testing a threadsafe version, but as I'm sure you are aware, these sorts of race conditions can be hard to formally root out.

Thanks so much for your insight and analysis, I suspect by the end of this week we ought to have a more conclusive picture of whether this change fixed the issue or not.

@leerho
Contributor

leerho commented Feb 9, 2018

@hpx7 @szunami

We haven't heard from you. Did synchronizing your sketch wrapper fix this issue? I would like to close this out. Thanks!

@szunami

szunami commented Feb 14, 2018

After running the synchronized code for a couple weeks, the issue has not resurfaced. Thanks as always for the help!

@leerho
Contributor

leerho commented Feb 14, 2018

Thank you for getting back to us. I think this thread will be valuable reading for a number of folks.

@leerho leerho closed this as completed Feb 14, 2018