Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-3446] Fixes RedisIO non-prefix read operations #4656

Closed
wants to merge 1 commit into from

Conversation

vvarma
Copy link
Contributor

@vvarma vvarma commented Feb 11, 2018

  • BaseReadFn to abstract general jedis operations.
  • Moved key fetch given prefix to ReadKeywsWithPattern DoFn.
  • ReadFn is pure fetch from redis given key.

URL: https://issues.apache.org/jira/browse/BEAM-3446

BaseReadFn to abstract general jedis operations. Separated key fetch using prefix and get by key into serparate DoFn.
@stale
Copy link

stale bot commented Jun 7, 2018

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions.

@stale stale bot added the wontfix label Jun 7, 2018
@vvarma
Copy link
Contributor Author

vvarma commented Jun 8, 2018

the fix is in place @jbonofre please help review

@stale stale bot removed the wontfix label Jun 8, 2018
@kennknowles
Copy link
Member

We have turned on autoformatting of the codebase, which causes small conflicts across the board. You can probably safely rebase and just keep your changes. Like this:

$ git rebase
... see some conflicts
$ git diff
... confirmed that the conflicts are just autoformatting
... so we can just keep our changes are do our own autoformat
$ git checkout --theirs --
$ git add -u
$ git rebase --continue
$ ./gradlew spotlessJavaApply

Please ping me if you run into any difficulty.

Copy link
Member

@iemejia iemejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, sorry for taking so long @vvarma. Can you please rebase this one and take a look at my comments. (@jbonofre sorry for jumping in but was taking a look at broken things because of the recent spotless enable and thought about helping with this one).

@@ -200,6 +200,7 @@ public void populateDisplayData(DisplayData.Builder builder) {

return input
.apply(Create.of(keyPattern()))
.apply(ParDo.of(new ReadKeywsWithPattern(connectionConfiguration())))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ReadKeywsWithPattern/ReadKeysWithPattern


private final RedisConnectionConfiguration connectionConfiguration;
private abstract static class BaseReadFn<T> extends DoFn<String, T> {
protected final RedisConnectionConfiguration connectionConfiguration;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be package private


private transient Jedis jedis;
protected transient Jedis jedis;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be package private


public ReadFn(RedisConnectionConfiguration connectionConfiguration) {
public BaseReadFn(RedisConnectionConfiguration connectionConfiguration) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove public, in general it is a common Beam practice to restrict access as much as possible. You can use IntelliJ's analyze code to do this.

@@ -288,28 +297,33 @@ public void processElement(ProcessContext processContext) throws Exception {
while (!finished) {
ScanResult<String> scanResult = jedis.scan(cursor, scanParams);
List<String> keys = scanResult.getResult();

Pipeline pipeline = jedis.pipelined();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: What is the reason to remove pipelining in general, seems like if the approach of this PR is more composable, it would perform worse, won't it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of keys that need to be looked up in a given window or batch can vary. Ideally, we should have a configurable batch size and use MGET https://redis.io/commands/mget if wanted to optimize further.
Pipelining an entire window or batch can cause memory spikes in Redis depending on the number of keys being looked up, for the time being, to simplify things I removed pipeline.

@vvarma
Copy link
Contributor Author

vvarma commented Jun 30, 2018

Closing this pr, opened a new one with fixes and rebased. #5841

@vvarma vvarma closed this Jun 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants