-
Notifications
You must be signed in to change notification settings - Fork 334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SAMZA-2439: Remove LocalityManager and container location information from JobModel #1421
Conversation
Looks pretty good at a high level. Left some minor comments, but let's get a second pair of eyes on this for a detailed review. Thanks for this! |
samza-api/src/main/java/org/apache/samza/job/model/HostLocality.java
Outdated
Show resolved
Hide resolved
samza-api/src/main/java/org/apache/samza/job/model/HostLocality.java
Outdated
Show resolved
Hide resolved
samza-api/src/main/java/org/apache/samza/job/model/LocalityModel.java
Outdated
Show resolved
Hide resolved
samza-api/src/main/java/org/apache/samza/job/model/LocalityModel.java
Outdated
Show resolved
Hide resolved
samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
Outdated
Show resolved
Hide resolved
samza-core/src/main/java/org/apache/samza/coordinator/server/LocalityServlet.java
Outdated
Show resolved
Hide resolved
samza-core/src/main/scala/org/apache/samza/coordinator/JobModelManager.scala
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a first pass, let's resolve those then will take a look at tests closely
samza-api/src/main/java/org/apache/samza/job/model/ContainerLocality.java
Outdated
Show resolved
Hide resolved
samza-api/src/main/java/org/apache/samza/job/model/LocalityModel.java
Outdated
Show resolved
Hide resolved
samza-api/src/main/java/org/apache/samza/job/model/LocalityModel.java
Outdated
Show resolved
Hide resolved
samza-core/src/main/java/org/apache/samza/clustermanager/ClusterBasedJobCoordinator.java
Show resolved
Hide resolved
samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
Show resolved
Hide resolved
samza-core/src/main/java/org/apache/samza/serializers/model/JsonContainerLocalityMixIn.java
Outdated
Show resolved
Hide resolved
samza-yarn/src/main/java/org/apache/samza/validation/YarnJobValidationTool.java
Show resolved
Hide resolved
samza-core/src/main/java/org/apache/samza/container/LocalityManager.java
Show resolved
Hide resolved
Thanks for the review. Addressed the feedback as discussed offline around |
samza-api/src/main/java/org/apache/samza/context/JobContext.java
Outdated
Show resolved
Hide resolved
samza-api/src/main/java/org/apache/samza/job/model/LocalityModel.java
Outdated
Show resolved
Hide resolved
samza-api/src/main/java/org/apache/samza/job/model/ProcessorLocality.java
Show resolved
Hide resolved
samza-api/src/main/java/org/apache/samza/job/model/ProcessorLocality.java
Show resolved
Hide resolved
samza-core/src/main/java/org/apache/samza/coordinator/server/LocalityServlet.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lg, left some comments
samza-core/src/test/java/org/apache/samza/clustermanager/TestContainerPlacementActions.java
Outdated
Show resolved
Hide resolved
samza-core/src/test/java/org/apache/samza/clustermanager/TestContainerPlacementActions.java
Outdated
Show resolved
Hide resolved
samza-core/src/test/java/org/apache/samza/clustermanager/TestContainerPlacementActions.java
Outdated
Show resolved
Hide resolved
samza-core/src/main/scala/org/apache/samza/coordinator/JobModelManager.scala
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix the tests and add todo and Ship It!
… from JobModel (apache#1421) Issues Currently locality information is part of job model. Job model typically is immutable and fixed within the lifecycle of an application attempt. The locality information on the other hand is dynamic and changes in the event of container movements. Due to this difference, it makes it complicated to program, model or define semantics around these models when building features. Furthermore, by removing this dependency - Enables us to move JobModel to public APIs and expose it in JobContext - Enables us to cache and serve serialized JobModel from the AM servlet to reduce AM overhead (memory, open connections, num threads) during container startup, esp. for jobs with a large number of containers (See: apache#1241) - Removes tech debt: models should be immutable, and should not update themselves. - Removes tech debt: makes current container location a first class concept for container scheduling / placement , and for tools like dashboard, samza-rest, auto-scaling, diagnostics etc. Changes - Separated out locality information out of job model into LocalityModel - Introduced an endpoint in AM to serve locality information - Added Json MixIns for locality models (LocalityModel & ContainerLocality) - Moved JobModel to samza-api and exposed through JobContext API Changes: - Introduced new models for locality. - Previous job model endpoint will no longer serve locality information. i.e. tools using these will need to update to use the new endpoint. - Expose JobModel via JobContext
… from JobModel (apache#1421) Issues Currently locality information is part of job model. Job model typically is immutable and fixed within the lifecycle of an application attempt. The locality information on the other hand is dynamic and changes in the event of container movements. Due to this difference, it makes it complicated to program, model or define semantics around these models when building features. Furthermore, by removing this dependency - Enables us to move JobModel to public APIs and expose it in JobContext - Enables us to cache and serve serialized JobModel from the AM servlet to reduce AM overhead (memory, open connections, num threads) during container startup, esp. for jobs with a large number of containers (See: apache#1241) - Removes tech debt: models should be immutable, and should not update themselves. - Removes tech debt: makes current container location a first class concept for container scheduling / placement , and for tools like dashboard, samza-rest, auto-scaling, diagnostics etc. Changes - Separated out locality information out of job model into LocalityModel - Introduced an endpoint in AM to serve locality information - Added Json MixIns for locality models (LocalityModel & ContainerLocality) - Moved JobModel to samza-api and exposed through JobContext API Changes: - Introduced new models for locality. - Previous job model endpoint will no longer serve locality information. i.e. tools using these will need to update to use the new endpoint. - Expose JobModel via JobContext
… from JobModel (apache#1421) Issues Currently locality information is part of job model. Job model typically is immutable and fixed within the lifecycle of an application attempt. The locality information on the other hand is dynamic and changes in the event of container movements. Due to this difference, it makes it complicated to program, model or define semantics around these models when building features. Furthermore, by removing this dependency - Enables us to move JobModel to public APIs and expose it in JobContext - Enables us to cache and serve serialized JobModel from the AM servlet to reduce AM overhead (memory, open connections, num threads) during container startup, esp. for jobs with a large number of containers (See: apache#1241) - Removes tech debt: models should be immutable, and should not update themselves. - Removes tech debt: makes current container location a first class concept for container scheduling / placement , and for tools like dashboard, samza-rest, auto-scaling, diagnostics etc. Changes - Separated out locality information out of job model into LocalityModel - Introduced an endpoint in AM to serve locality information - Added Json MixIns for locality models (LocalityModel & ContainerLocality) - Moved JobModel to samza-api and exposed through JobContext API Changes: - Introduced new models for locality. - Previous job model endpoint will no longer serve locality information. i.e. tools using these will need to update to use the new endpoint. - Expose JobModel via JobContext
Issues
Currently locality information is part of job model. Job model typically is immutable and fixed within the lifecycle of an application attempt. The locality information on the other hand is dynamic and changes in the event of container movements. Due to this difference, it makes it complicated to program, model or define semantics around these models when building features. Furthermore, by removing this dependency
Changes
LocalityModel
LocalityModel
&ContainerLocality
)JobModel
to samza-api and exposed throughJobContext
Tests
API Changes:
JobModel
viaJobContext
Upgrade Instructions: None. Refer to the API changes & the usage instructions below to upgrade your tooling if applicable.
Usage Instructions: The new locality information is served under am endpoint within
locality
sub page. Tooling that used the AM endpoint to fetch locality information will need to be updated as follows.The endpoint supports two types of queries
GET <am-endpoint>/locality
. A sample response will look like the followingprocessorId
in the request. e.g.GET <am-enpoint>/locality?processorId=x
. A sample response will look like the following