Skip to content
This repository has been archived by the owner on Jun 7, 2021. It is now read-only.

[APEX-209] Refactored monolithic deployInputStream function. #129

Closed
wants to merge 3 commits into from
Closed

[APEX-209] Refactored monolithic deployInputStream function. #129

wants to merge 3 commits into from

Conversation

ilganeli
Copy link
Contributor

In an effort to better understand the flow of data within Apex, I have refactored the monolithic (200+ line long) deployInputStreams function to modularize it, add comments, and clarify operational flow.
This patch also includes formatting fixes per CodeStyle/CheckStyle guidelines for this function.

I would very much appreciate a review to confirm that method and variable names make sense and that comments appropriately capture what's going on. There has not been any modification of code (with the exception of duplicate instantiation of streamCodecIdentifier, streamCodec, sinkIdentifer, and sourceIdentifier in connectInputToBufferServer and connectInputToLocalSink. This was done for the sake of more concise method declaration. All other changes only involved method extraction.

@243826
Copy link
Contributor

243826 commented Oct 29, 2015

Ilya, is splitting a method declaration part of our coding guidelines?

@ilganeli
Copy link
Contributor Author

@243826 - this wasn't done explicitly based on the coding guidelines but was done in an effort to appeal to common best practices for coding. As an outsider, seeing this code base for the first time, it was difficult to make sense of it. Given the importance of building community and driving involvement from outside sources when expanding the number of contributors and committers, I made these changes to help make involvement easier. The more accessible and clear the code-base, the more likely it is to be maintained and improved in the future.

@ilooner
Copy link
Contributor

ilooner commented Oct 29, 2015

+1 for adding documentation and making things more modular

@pramodin
Copy link
Contributor

I think we need to be careful with code that is in the critical path of data transfer as we cannot sacrifice performance for better readability so we cannot make it a general rule. I need to look at this commit in a little bit more particular detail but it doesn't look like the code is in the critical path as it is around deployment of the operators.

@243826
Copy link
Contributor

243826 commented Oct 29, 2015

@ilganeli the coding guidelines are there to leave no room for different interpretation of what commonly agreed best practice is. So before introducing personal flavors, please get it approved in the same fashion as the coding guidelines were approved.

jkl = xyz(a, b, c)

to

jkl = 
  x(a
     b,
     c);

is very subjective in terms of value add.

You said it correctly that you are looking at the code for the first time so it's difficult. There is no expectation that people will get it right the first time they read it. After your reformatting, I am looking at the changes you made and now I find it difficult to comprehend what's going on :-)

Separately - please do not submit reformatting changes. There are tools which can do it in one shot and probably save your and reviewers' time for better utilization elsewhere. May be if you want to work on these tools, I can share some code with you. Doing this piecemeal or without non format related changes just mucks with the history and context crippling the tools which help us utilize them. On top, it wrongly attributes it to a person who in some cases may not even understand the code.

If you are fixing a bug - then of course go ahead and reformat the newly written / modified code. Keeping the changes local will be nice as others working on the code will know what exactly changed. This will also make sure that we do not try to fix something that is not broken.

@ilooner
Copy link
Contributor

ilooner commented Oct 29, 2015

@243826 I think there is value in breaking up code into functions. At some points the original deployInputStreams function is nested 5 levels deep. That may be easy for you to read but for some people that is difficult to read. Maybe we can share these changes with a wider audience on the dev mailing list to get more opinions?

I also agree that it is important to preserve attribution. I know it is possible to change the author of a commit, maybe @ilganeli can provide a dummy commit which restores the original author of the code that has been moved to separate functions.

If you still feel this is not the place to be making these changes could you suggest a ticket for @ilganeli to work on, so that we can utilize his contributions?

@ilganeli
Copy link
Contributor Author

@ilooner @243826 I think maintaining provenance of the code is definitely important, how would you recommend submitting a patch that maintains this?

With regards to formatting, the formatting in this patch is consistent with the style guide and check style as it stood when I submitted the patch. Since both of those have undergone significant changes since that point, I've just submitted an update which brings this code up to date.

@PramodSSImmaneni I agree that touching critical path code is risky, hence my ask for a comprehensive review. If it proves too difficult to analyze this patch, I could submit a series of smaller patches based on this work that does this in a series of smaller steps.

@otterc
Copy link
Contributor

otterc commented Oct 29, 2015

Hi,
There were a couple of bugs in the configuration settings of IntelliJ IDE and I apologize for that. I think a part of the discussion here arose because of these bugs. I have made fixes to IntelliJ settings in devel-3.
Please let me know if you see more discrepancies.

@vrozov
Copy link
Member

vrozov commented Oct 29, 2015

In general I would suggest to avoid auto-format in IntelliJ or any other IDE. It may be used for the new code, but IMO bulk/blind updates to existing code should be avoided due to possible bugs or unwanted behavior of IDE auto format feature (I am not even sure that all IDEs provide the same functionality). As an alternative I would recommend codestyle plugins available for all major IDEs. For IntelliJ I use CheckStyle-IDEA and it can identify all existing violations.

@ilooner
Copy link
Contributor

ilooner commented Oct 29, 2015

@ilganeli this can change the author of a commit

git commit --amend --author="User Name user@abc.com"

I think there are two options for doing this. You can attribute the entire change to the original author or you can attribute the original code sections to the original author and the new functions to you. This could be done by submitting two commits. The first commit will contain all your changes and something like a // at the end of each line of original code. The second commit would remove the // at the end of each line of original code and will be attributed to the original author. The second commit could have a message like "restoring original author"

I suggest posting this pull request to the dev mailing list to get more feedback about the readability and cleanliness of the change, since there seem to be some differing opinions.

@gauravgopi123
Copy link
Contributor

I think the issue is not about author of the commit. It is more about coding guidelines. If there are approved coding guidelines that are being followed for this fix, then we need to follow same guidelines at couple of other places where there are monolithic functions

@gauravgopi123
Copy link
Contributor

Any AI on this PR? or should this be closed?

@asfgit asfgit closed this in 9ffbc73 Jan 29, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
7 participants