SAMZA-2604: Datamodel change to capture physical container id for AM HA #1445

Merged: mynameborat merged 5 commits into apache:master from lakshmi-manasa-g:amha-dataModelYarnId on Dec 1, 2020

@lakshmi-manasa-g

Feature: The main feature is high availability (HA) for the cluster-based job coordinator (aka AM) (TODO: sep/doc how?). It ensures that when the AM dies, the new AM can connect to the containers that are already running, instead of restarting all of them. This PR captures the mapping between the physical execution environment container ID (e.g., YARN container ID "container_123_123") and the Samza logical processor ID (e.g., "0"). In future PRs, the new AM will use this mapping.

Changes:

  1. Introduce a new coordinator stream message and a manager to read and write it.
  2. On launch, each container writes its logical and physical IDs to the coordinator stream (c-stream).
  3. On launch, the job coordinator (AM) reads the mapping for all containers from the c-stream.
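
A minimal sketch of the write/read flow described in the changes above. It does not use the classes added in this PR: `InMemoryCoordinatorStream`, `ContainerIdMappingManager`, and their methods are hypothetical names, and the coordinator stream is assumed to behave like a keyed store where the last write per key wins.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: all class and method names here are hypothetical,
// not the classes introduced by this PR.
class ContainerIdMappingSketch {

  /** Stand-in for the coordinator stream: a keyed store where the last write per key wins. */
  static final class InMemoryCoordinatorStream {
    private final Map<String, String> entries = new HashMap<>();

    void put(String key, String value) {
      entries.put(key, value);
    }

    Map<String, String> readAll() {
      // Return a read-only snapshot so callers cannot mutate the store.
      return Collections.unmodifiableMap(new HashMap<>(entries));
    }
  }

  /** Hypothetical manager that reads and writes logical-to-physical container ID mappings. */
  static final class ContainerIdMappingManager {
    private final InMemoryCoordinatorStream stream;

    ContainerIdMappingManager(InMemoryCoordinatorStream stream) {
      this.stream = stream;
    }

    // Step 2: a container calls this on launch with its own IDs.
    void writeMapping(String processorId, String executionContainerId) {
      stream.put(processorId, executionContainerId);
    }

    // Step 3: the job coordinator (AM) calls this on launch.
    Map<String, String> readAllMappings() {
      return stream.readAll();
    }
  }

  public static void main(String[] args) {
    InMemoryCoordinatorStream stream = new InMemoryCoordinatorStream();
    ContainerIdMappingManager containerSide = new ContainerIdMappingManager(stream);

    // Two containers register themselves when they start.
    containerSide.writeMapping("0", "container_123_123");
    containerSide.writeMapping("1", "container_123_124");

    // A relaunched container for processor "1" overwrites the older entry.
    containerSide.writeMapping("1", "container_123_125");

    // A new AM instance reads the mapping for all containers on startup.
    ContainerIdMappingManager amSide = new ContainerIdMappingManager(stream);
    Map<String, String> running = amSide.readAllMappings();
    System.out.println("processor 0 -> " + running.get("0"));
    System.out.println("processor 1 -> " + running.get("1"));
  }
}
```

Keying the store by logical processor ID means a new AM sees only the most recent physical container for each processor, which is the information it would need to reconnect rather than restart.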

Tests:

  1. Added a unit test for the new manager.
  2. Tests for the other classes are in progress, as the relevant code currently has no coverage.

API changes:

  1. New c-stream message

Usage instructions: None

Upgrade instructions: Backwards compatible. N/A

@mynameborat left a comment

I forgot about this PR and ended up reviewing these changes as part of the orchestration PR. Sorry about that. Let me know if that's okay, or I can port the comments over to reduce overhead.

@lakshmi-manasa-g changed the title from "[WIP-tests tbd] SAMZA-2604: Datamodel change to capture physical container id for AM HA" to "SAMZA-2604: Datamodel change to capture physical container id for AM HA" on Dec 1, 2020

@mynameborat left a comment

Minor comment. Rest of it looks good to me.

@mynameborat

Please resolve the conflict so that I can merge the PR.

@mynameborat merged commit c212cac into apache:master on Dec 1, 2020
lakshmi-manasa-g added a commit to lakshmi-manasa-g/samza that referenced this pull request Feb 9, 2021
tranjith pushed a commit to tranjith/samza that referenced this pull request Mar 23, 2021