
[HUDI-503] Add hudi test suite documentation into the README file of the test suite module #1191

Merged 1 commit into apache:hudi_test_suite_refactor on Feb 4, 2020

Conversation

@yanghua (Contributor) commented Jan 6, 2020

What is the purpose of the pull request

Add hudi test suite documentation into the README file of the test suite module

Brief change log

  • Add hudi test suite documentation into the README file of the test suite module

Verify this pull request

This pull request is a trivial rework / code cleanup without any test coverage.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@yanghua (Contributor Author) commented Jan 6, 2020

cc @n3nash

Review comment on the following README excerpt:

```
--class org.apache.hudi.bench.job.HudiTestSuiteJob
--workload-yaml-path /path/to/your-workflow-dag.yaml
...
```

Generating a custom Workload Pattern

There are two ways to generate a workload pattern:

1. Programmatically
   Write the entire DAG of operations programmatically; take a look at the WorkflowDagGenerator class. Once you're ready with the DAG you want to execute, simply pass the class name as follows:

   ```
   spark-submit
   ...
   ...
   --class org.apache.hudi.bench.job.HudiTestSuiteJob
   --workload-generator-classname org.apache.hudi.bench.dag.scheduler.<your_workflowdaggenerator>
   ...
   ```

2. YAML file
   Write the entire DAG of operations in YAML; take a look at complex-workload-dag-cow.yaml or complex-workload-dag-mor.yaml. Once you're ready with the DAG you want to execute, simply pass the YAML file path as follows:

   ```
   spark-submit
   ...
   ...
   --class org.apache.hudi.bench.job.HudiTestSuiteJob
   --workload-yaml-path /path/to/your-workflow-dag.yaml
   ...
   ```

Hey, I think it looks a little clearer
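For illustration, a complete invocation that wires these flags together might look like the sketch below; the master, deploy mode, and bundle jar path are assumptions for the example, not values taken from the README excerpt.

```
# Hypothetical end-to-end invocation of the test suite job. Only --class and
# --workload-yaml-path come from the excerpt above; master, deploy mode, and
# the jar path are assumptions.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --class org.apache.hudi.bench.job.HudiTestSuiteJob \
  /path/to/hudi-test-suite-bundle.jar \
  --workload-yaml-path /path/to/your-workflow-dag.yaml
```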

@yanghua (Contributor Author) replied:

Thanks for your suggestion. I have addressed it.

@yanghua (Contributor Author) commented Jan 7, 2020

cc @n3nash

Review comment on the following README excerpt:

## Entry class to the test suite

```
org.apache.hudi.bench.job.HudiTestSuiteJob.java - Entry Point of the hudi test suite job. This
```

We need to change the package name here, like remove "bench"
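A rough sketch of how such a bulk rename could be scripted is below; the new package name used here is hypothetical, and the matching source directories would still need to be moved (e.g. with git mv):

```
# Hypothetical bulk rename of the package prefix; adjust the module path and
# the replacement package to whatever the final naming decision is.
grep -rl 'org\.apache\.hudi\.bench' hudi-bench/src | \
  xargs sed -i 's/org\.apache\.hudi\.bench/org.apache.hudi.testsuite/g'
```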


Review comment on the following README excerpt:

## Configurations required to run the job

```
org.apache.hudi.bench.job.HudiTestSuiteConfig - Config class that drives the behavior of the
```
@n3nash (Contributor) commented Jan 10, 2020

Same here, I would also re-check if the names of the classes are the same.

Review comment on the following README excerpt:

```
[INFO] hudi-spark ......................................... SUCCESS [ 34.499 s]
[INFO] hudi-utilities ..................................... SUCCESS [ 8.626 s]
[INFO] hudi-cli ........................................... SUCCESS [ 14.921 s]
[INFO] hudi-bench ......................................... SUCCESS [ 7.706 s]
```

You might have to re-run and update this with the new package name..
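A sketch of how the timings above could be regenerated after the rename, assuming the standard Maven reactor build from the repository root:

```
# Rebuild all modules and capture the reactor summary; the renamed module
# should then replace hudi-bench in the [INFO] lines quoted above.
mvn clean install -DskipTests | tee build.log
grep 'SUCCESS \[' build.log
```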

Review comment on the following README excerpt:

```
# COPY_ON_WRITE tables
=========================
## Run the following command to start the test suite
spark-submit \
```

Please start your docker environment and try to run both these commands after the renaming and package changes, to make sure they run fine.
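A possible sequence for that verification, assuming the demo scripts that live under the repository's docker/ directory; the checkout path is a placeholder:

```
# Bring up the Hudi docker demo environment, run both spark-submit commands
# from the README inside it, then tear the environment down.
cd /path/to/incubator-hudi/docker
./setup_demo.sh
# ...run both test suite spark-submit commands from the README here...
./stop_demo.sh
```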

@n3nash (Contributor) commented Jan 10, 2020

@yanghua Left some comments

@yanghua (Contributor Author) commented Jan 10, 2020

@n3nash The renaming work is done. Please have another look.

@n3nash (Contributor) commented Jan 11, 2020

@yanghua looks good, did you try running it in docker? Also, can you squash your commits so that I can merge this PR?

@yanghua (Contributor Author) commented Jan 11, 2020

> @yanghua looks good, did you try running it in docker? Also, can you squash your commits so that I can merge this PR?

Absolutely, I can squash the commits. Sorry, I did not verify those commands in docker; my local docker environment always has some problems. Can you help verify them?
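For reference, squashing into a single commit before the merge could be done with an interactive rebase; a minimal sketch, assuming apache is the upstream remote and hudi_test_suite_refactor (the merge target of this PR) is the base branch:

```
# Fetch the latest base branch and replay the local work on top of it,
# marking all but the first commit as "squash" in the rebase todo list.
git fetch apache hudi_test_suite_refactor
git rebase -i apache/hudi_test_suite_refactor
# Force-push the rewritten history to the PR branch (branch name is hypothetical).
git push --force-with-lease origin HUDI-503-test-suite-readme
```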

@n3nash (Contributor) commented Jan 16, 2020

I verified them and they look OK; please fix the build and I can merge this.

@yanghua

@yanghua (Contributor Author) commented Jan 16, 2020

> I verified them and they look OK; please fix the build and I can merge this.
> @yanghua

OK. The Travis failure is due to the upgrade of the Spark dependencies. I have rebased the test suite branch and am trying to figure it out.

@n3nash (Contributor) commented Jan 23, 2020

@yanghua were you able to fix the build?

@yanghua (Contributor Author) commented Jan 24, 2020

> @yanghua were you able to fix the build?

@n3nash Sorry, I have not had time to find the root cause yet; I am on the Chinese New Year holiday now. If you have time, can you help locate the issue? I believe it is due to the Spark version bump.

@n3nash (Contributor) commented Jan 28, 2020

@yanghua no worries, happy new year! Please take a look at this once you're back from the holiday.

@n3nash (Contributor) commented Feb 4, 2020

I guess once you rebase this, the build should get fixed (I merged your Spark upgrade PR).

@yanghua (Contributor Author) commented Feb 4, 2020

> I guess once you rebase this, the build should get fixed (I merged your Spark upgrade PR).

Yes, I will rebase this PR and let Travis check it again.

@yanghua merged commit 044759a into apache:hudi_test_suite_refactor on Feb 4, 2020
yanghua added a commit to yanghua/incubator-hudi that referenced this pull request Feb 13, 2020
yanghua added a commit that referenced this pull request Feb 13, 2020
yanghua added a commit that referenced this pull request Feb 14, 2020