Analytics workloads upgrade #416

Merged
merged 24 commits into from
Mar 16, 2023


UlisesLuzius
Collaborator

This commit includes:

  1. in-memory-analytics and graph-analytics: Spark and Scala version update
  2. data-analytics: arguments to control the number of cores and the amount of memory allocated
  3. data-analytics: separation of the dataset into its own container
  4. data-analytics: separation of master and slave, allowing an IP address to be passed instead of a hostname
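
To illustrate items 2 and 4, here is a minimal sketch of how a worker launcher could accept resource limits and a master address given either as an IP or as a hostname. It is not the code in this PR: only `--yarn-cores` appears in this conversation, while `--master`, `--yarn-memory-mb`, and the function names are hypothetical and chosen for illustration.

```python
import argparse
import ipaddress


def is_ip_address(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical worker launcher")
    # Hypothetical flag: the master may be given as an IP address or a hostname.
    parser.add_argument("--master", required=True,
                        help="IP address or hostname of the master")
    # --yarn-cores is the flag named in this conversation; its exact semantics
    # in the real image are an assumption here.
    parser.add_argument("--yarn-cores", type=int, default=None,
                        help="number of cores the worker may use")
    # Hypothetical flag for the memory limit, in megabytes.
    parser.add_argument("--yarn-memory-mb", type=int, default=None,
                        help="memory the worker may use, in MB")
    return parser


def parse_worker_args(argv):
    """Parse and validate arguments, returning a plain dict of settings."""
    args = build_parser().parse_args(argv)
    if args.yarn_cores is not None and args.yarn_cores < 1:
        raise ValueError("--yarn-cores must be at least 1")
    if args.yarn_memory_mb is not None and args.yarn_memory_mb < 1:
        raise ValueError("--yarn-memory-mb must be at least 1")
    return {
        "master": args.master,
        "master_is_ip": is_ip_address(args.master),
        "cores": args.yarn_cores,
        "memory_mb": args.yarn_memory_mb,
    }
```

Leaving both limits unset (`None`) lets the launcher fall back to whatever defaults the cluster already uses, so existing invocations keep working.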

@UlisesLuzius
Collaborator Author

The GitHub CI modifications are still missing. The CI needs to:

  1. Build the Wikimedia dataset
  2. Build the new versions of Spark and Hadoop
  3. Correct the path of bench-analytics/4.0 to bench-analytics

Contributor

@xusine xusine left a comment

I am running this PR and observed one bug: when --yarn-cores is set on the only worker for Data Analytics, the application makes no forward progress.

Once this bug is fixed, we can merge this PR.

@xusine xusine self-requested a review March 16, 2023 16:25
@xusine xusine merged commit b45ce2c into main Mar 16, 2023
@xusine xusine deleted the analytics-upgrade branch March 16, 2023 17:06
@UlisesLuzius UlisesLuzius mentioned this pull request Mar 20, 2023
21 tasks