Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flat Records #375

Merged
merged 5 commits into from Jan 14, 2019

Conversation

@dkoszewnik
Copy link
Contributor

commented Jan 14, 2019

FlatRecords provide the ability to serialize records for addition to a remote state engine.

This is experimental functionality for now -- the API and serialized representations are both subject to backwards incompatible changes without notice.

@dkoszewnik dkoszewnik requested review from PaulSandoz and toolbear Jan 14, 2019


public class HollowWriteFieldUtils {

public static int readIntBits(ByteData data, long fieldPosition) {

This comment has been minimized.

Copy link
@PaulSandoz

PaulSandoz Jan 14, 2019

Contributor

These two methods could be placed as default methods on ByteData, which then allows for more optimal implementations by overriding implementations (if present)

if(entry.getValue() != DELETE_RECORD)
newState.add(entry.getValue());
executor.execute(() -> {
FlatRecordDumper flatRecordDumper = null;

This comment has been minimized.

Copy link
@PaulSandoz

PaulSandoz Jan 14, 2019

Contributor

There is no parallelism since only one task is executed asynchronously and the code that follows waits for that task in the executor to complete. Was that the intention?
My suggestion would be to keep this simple, snapshot the mutations (values()) into a list, and operate sequentially within the same thread. We can easily change this later to a stream and operate in parallel getting good decomposition. Or alternatively split the list into N parts in proportion to the number of hardware cores you want to use and execute N times (but parallel streams can do this more easily and better).
The guide of "threadsPerCpu" to control the parallelism is likely to be wrong most of the time. We need to express something that takes into account the machine parallelism (number of hardware cores to use), the amount of work (number of mutations), and the unit cost per work (the cost for each mutation). This is not easy to get right, hence why there is currently no mechanism in parallel streams :-) however we can control the former (machine parallelism) by executing the parallel stream within a custom fork join pool (to avoid using the common fork join pool).

This comment has been minimized.

Copy link
@dkoszewnik

dkoszewnik Jan 14, 2019

Author Contributor

You're right, that was not the intention. I have corrected it.

@rpalcolea originally added parallelism here when he pulled out the HollowIncrementalCyclePopulator, perhaps in response to a bottleneck he encountered in practice, so I'll retain that for now.

@PaulSandoz
Copy link
Contributor

left a comment

Nice! I wrote some comments but i think we can iterate in master given the experimental nature.

@toolbear
Copy link
Contributor

left a comment

Instead of adapting the existing incremental producer, should we instead build up a new replacement API? That way we can deprecate the current API as-is without introducing any changes to it. Also, I wonder if there are features in the current API that we'd like to abandon. Or if it's just easier to iterate on the design without being encumbered by choices in the current implementation.

If we want to go the route of introducing a new HollowIncrementalProducer and keep some of the same names, we could create the FlatRecord based implementation under an incubating package name.

I don't know if this lines up with the changes Paul had in mind. If the new API is far more similar to the current API then maybe it's a bad idea to fork the classes. I want to be able to release without having to disrupt Target or anyone else using the current API until we have an incremental producer that's a mature replacement for the "beta" one we already released.

Alternatively, we could do as we warned we might and make any breaking changes we need, suggesting that people fork the older API if they need it or await a mature release of the new API.

@@ -0,0 +1,109 @@
package com.netflix.hollow.core.write.objectmapper.flatrecords;

This comment has been minimized.

Copy link
@toolbear

toolbear Jan 14, 2019

Contributor

Per the incubating/forking theme, this could be com.netflix.hollow.core.write.objectmapper.flatrecords.incubating or com.netflix.hollow.incubating.core.write.objectmapper.flatrecords. Or replacing incubating with experimental.

import com.netflix.hollow.core.write.objectmapper.RecordPrimaryKey;

/**
* Warning: Experimental. This is a BETA API and is subject to breaking changes.

This comment has been minimized.

Copy link
@toolbear

toolbear Jan 14, 2019

Contributor

Maybe:

Warning: Experimental. This API is subject to breaking changes.

Or if we go with the incubating nomenclature:

Warning: Incubating. This API is subject to breaking changes.

I picked up the "Incubating" nomenclature from a Netflix OSS meeting topic on being more explicit about project maturity. Anyhow...

I don't have a strong opinion on the word we use except that I think we should move away from using "beta". Ironically as it seems, I think "beta" has come to mean "almost mature" whereas experimental or incubating does a better job of indicating that the maturity of this feature.

This comment has been minimized.

Copy link
@dkoszewnik

dkoszewnik Jan 14, 2019

Author Contributor

👍 Improved the admonitions in the master branch after merge.

if(entry.getValue() != DELETE_RECORD)
newState.add(entry.getValue());
executor.execute(() -> {
FlatRecordDumper flatRecordDumper = null;

This comment has been minimized.

Copy link
@toolbear

toolbear Jan 14, 2019

Contributor

Per the incubating/forking theme, this gets simpler if all we're supporting (initially) is a FlatRecord based incremental producer.

Again, maybe Paul already has an idea for how to generalize this that makes my suggestion moot.

return assignedOrdinal;
}

@Override

This comment has been minimized.

Copy link
@toolbear

toolbear Jan 14, 2019

Contributor

Add the admonition?

    /**
     * Warning: Experimental.  This is a BETA API and is subject to breaking changes.
     */

@dkoszewnik dkoszewnik merged commit 615201e into master Jan 14, 2019

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@dkoszewnik dkoszewnik deleted the flat-records branch Jan 14, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.