[SPARK-29028][DOCS] Add links to IBM Cloud Object Storage connector in cloud-integration.md #25737

dilipbiswal · 2019-09-09T22:58:07Z

What changes were proposed in this pull request?

Add links to IBM Cloud Storage connector in cloud-integration.md

Why are the changes needed?

This page lists the connectors for cloud providers. The connector for
IBM Cloud Object Storage is currently not listed. This PR adds the necessary
links for completeness.

Does this PR introduce any user-facing change?

Yes.

Before:

After:

How was this patch tested?

Tested using jekyll build --serve


dilipbiswal · 2019-09-09T23:06:40Z

cc @srowen Please let me know your thoughts on whether this can be added.

srowen · 2019-09-09T23:11:29Z

docs/cloud-integration.md

@@ -257,4 +257,5 @@ Here is the documentation on the standard connectors both from Apache and the cl
 * [Amazon EMR File System (EMRFS)](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-fs.html). From Amazon
 * [Google Cloud Storage Connector for Spark and Hadoop](https://cloud.google.com/hadoop/google-cloud-storage-connector). From Google
 * [The Azure Blob Filesystem driver (ABFS)](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-abfs-driver)
+* IBM Cloud Object Storage connector for Apache Spark : [Stocator,](https://github.com/CODAIT/stocator) [IBM Object Storage,](https://www.ibm.com/cloud/object-storage) [how-to-use-connector](https://developer.ibm.com/code/2018/08/16/installing-running-stocator-apache-spark-ibm-cloud-object-storage)


If this is more or less the equivalent of "S3 connector docs" for AWS, but for the IBM cloud, it could be OK. However, I think this doc is a little more about what Spark directly supports, particularly through hadoop-cloud. (In any event, I think you need to fix the anchors? They have commas in them.) Would this be more appropriate at https://github.com/apache/spark-website/blob/asf-site/third-party-projects.md? It seems to refer to a third-party integration, not first-party cloud docs.

@srowen Thank you very much for your quick response. Here, I was trying to model this after how the Google Cloud Storage Connector is specified in this section.

> In any event I think you need to fix the anchors? they have commas in them.

So Sean, here I had three links: 1) to the connector, 2) to IBM cloud storage, and 3) to a devworks article that ties them together, so I separated them with commas. Should I just remove the commas and use a space as the separator?

Yes, but I am wondering first whether this is the right place. hadoop-cloud and thus Spark doesn't have special support for this connector, and that's what this doc is about.

I'm also just noting that it seemed odd to put the comma within the hyperlinked text.

@srowen Thanks. I am not 100% sure whether this is the right place :-). Could you please help me understand how the Google Cloud Storage Connector for Spark and Hadoop is placed here? When I click here and navigate to the connector link, I end up at `https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/gcs`, which is the connector for Google Cloud Storage and which I thought was similar to the Stocator link?

Actually I think this place is fine, after re-reading the doc. It is a more general reference. I would just fix the links a bit. [...](...), not [...,](...)

@srowen Thank you. I have updated the links per your advice. I have also updated the screenshot.
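
The link fix discussed in this thread (`[...](...)`, not `[...,](...)`) can also be checked mechanically. The following is only a minimal sketch in Python, not part of this PR or of Spark's docs tooling: the function name `move_comma_outside` and the regular expression are assumptions made for illustration, and the pattern handles only simple inline links whose URLs contain no spaces or parentheses.

```python
import re

# Matches an inline Markdown link whose link text ends in a comma,
# e.g. "[Stocator,](https://github.com/CODAIT/stocator)".
# Group 1 is the link text without the comma, group 2 is the URL.
LINK_WITH_TRAILING_COMMA = re.compile(r"\[([^\]]*?)\s*,\]\(([^)\s]+)\)")


def move_comma_outside(markdown: str) -> str:
    """Rewrite "[text,](url)" as "[text](url)," and leave other links untouched."""
    return LINK_WITH_TRAILING_COMMA.sub(r"[\1](\2),", markdown)


# The three links from the line proposed in this PR.
line = (
    "[Stocator,](https://github.com/CODAIT/stocator) "
    "[IBM Object Storage,](https://www.ibm.com/cloud/object-storage) "
    "[how-to-use-connector](https://developer.ibm.com/code/2018/08/16/"
    "installing-running-stocator-apache-spark-ibm-cloud-object-storage)"
)
fixed = move_comma_outside(line)
print(fixed)
```

In this sketch the comma is moved after the closing parenthesis rather than dropped, so the list of links stays comma-separated while the hyperlinked text itself no longer contains punctuation.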

SparkQA · 2019-09-09T23:12:41Z

Test build #110375 has finished for PR 25737 at commit 8f9da49.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

kiszk · 2019-09-10T04:02:04Z

docs/cloud-integration.md

@@ -257,4 +257,5 @@ Here is the documentation on the standard connectors both from Apache and the cl
 * [Amazon EMR File System (EMRFS)](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-fs.html). From Amazon
 * [Google Cloud Storage Connector for Spark and Hadoop](https://cloud.google.com/hadoop/google-cloud-storage-connector). From Google
 * [The Azure Blob Filesystem driver (ABFS)](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-abfs-driver)
+* IBM Cloud Object Storage connector for Apache Spark : [Stocator,](https://github.com/CODAIT/stocator) [IBM Object Storage,](https://www.ibm.com/cloud/object-storage) [how-to-use-connector](https://developer.ibm.com/code/2018/08/16/installing-running-stocator-apache-spark-ibm-cloud-object-storage)


nit: Would it be better to add `. From IBM`, as is done for Amazon and Google?

SparkQA · 2019-09-10T15:26:10Z

Test build #110420 has finished for PR 25737 at commit dc6dd1c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2019-09-10T16:20:20Z

Merged to master

dilipbiswal · 2019-09-10T16:43:56Z

Thank you very much @srowen

…n cloud-integration.md

### What changes were proposed in this pull request?
Add links to IBM Cloud Storage connector in cloud-integration.md

### Why are the changes needed?
This page mentions the connectors to cloud providers. Currently connector to IBM cloud storage is not specified. This PR adds the necessary links for completeness.

### Does this PR introduce any user-facing change?
Yes.

**Before:**
<img width="1234" alt="Screen Shot 2019-09-09 at 3 52 44 PM" src="https://user-images.githubusercontent.com/14225158/64571863-11a2c080-d31a-11e9-82e3-78c02675adb9.png">

**After:**
<img width="1234" alt="Screen Shot 2019-09-10 at 8 16 49 AM" src="https://user-images.githubusercontent.com/14225158/64626857-663e4e00-d3a3-11e9-8fa3-15ebf52ea832.png">

### How was this patch tested?
Tested using jekyll build --serve

Closes apache#25737 from dilipbiswal/ibm-cloud-storage.

Authored-by: Dilip Biswal <dbiswal@us.ibm.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>

[SPARK-29028] Add links to IBM Cloud Object Storage connector in clou…

8f9da49

…d-integration.md

srowen reviewed Sep 9, 2019


dongjoon-hyun added the DOCUMENTATION label Sep 10, 2019

kiszk reviewed Sep 10, 2019


Code review

dc6dd1c

srowen approved these changes Sep 10, 2019


srowen closed this in 7309e02 Sep 10, 2019
