Point users to Dask-Yarn from spark documentation #4770

Merged: 2 commits, May 3, 2019

mrocklin (Member) commented May 2, 2019

This is our most highly trafficked page.
Let's direct traffic to the Dask-Yarn and JupyterHub-on-Hadoop pages from here.

  • Tests added / passed
  • Passes flake8 dask

cc @jcrist: if you want to take over here, I would welcome that. I suspect that you can speak to this audience with more authority than I can.

mrocklin (Member, Author) commented May 3, 2019

Merging this afternoon if there are no comments.

Dask workloads on your current infrastructure and vice versa.

In particular, for users coming from traditional Hadoop/Spark clusters (such as
those sold by Cloudera/Hortonworks) you are likely using the Yarn resource

jcrist (Member) commented May 3, 2019

Not likely, this is always true. I'd just delete this sentence and merge this paragraph with the above.

mrocklin (Member, Author) commented May 3, 2019

I think it might be useful to call out Cloudera/Hortonworks for readers who know they have such a cluster but don't understand what Yarn is.

jcrist (Member) commented May 3, 2019

My disagreement is with the "likely using the Yarn resource manager" part - this is always true for a Hadoop cluster. Perhaps just drop that part then:

For users coming from traditional Hadoop/Spark clusters (such as those sold by Cloudera/Hortonworks), you can deploy dask on these systems...

jcrist (Member) commented May 3, 2019

Overall this looks good to me.

mrocklin merged commit 5ad6d30 into dask:master on May 3, 2019

mrocklin deleted the mrocklin:spark-yarn-dask branch on May 3, 2019

jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this pull request May 14, 2019

Point users to Dask-Yarn from spark documentation (dask#4770)

Thomas-Z added a commit to Thomas-Z/dask that referenced this pull request May 17, 2019

Point users to Dask-Yarn from spark documentation (dask#4770)