[SPARK-29028][DOCS] Add links to IBM Cloud Object Storage connector in cloud-integration.md #25737
cc @srowen Please let me know your thoughts on whether this can be added.
docs/cloud-integration.md
Outdated
@@ -257,4 +257,5 @@ Here is the documentation on the standard connectors both from Apache and the cl
* [Amazon EMR File System (EMRFS)](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-fs.html). From Amazon
* [Google Cloud Storage Connector for Spark and Hadoop](https://cloud.google.com/hadoop/google-cloud-storage-connector). From Google
* [The Azure Blob Filesystem driver (ABFS)](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-abfs-driver)
* IBM Cloud Object Storage connector for Apache Spark : [Stocator,](https://github.com/CODAIT/stocator) [IBM Object Storage,](https://www.ibm.com/cloud/object-storage) [how-to-use-connector](https://developer.ibm.com/code/2018/08/16/installing-running-stocator-apache-spark-ibm-cloud-object-storage)
If this is more or less the equivalent of the "S3 connector docs" for AWS, but for the IBM cloud, it could be OK. However, I think this doc is a little more about what Spark directly supports, particularly through `hadoop-cloud`. (In any event, I think you need to fix the anchors; they have commas in them.) Would this be more appropriate at https://github.com/apache/spark-website/blob/asf-site/third-party-projects.md? It seems to refer to a third-party integration, not first-party cloud docs.
@srowen Thank you very much for your quick response. Here, I was trying to follow how the Google Cloud Storage Connector is listed in this section.
> In any event I think you need to fix the anchors? they have commas in them.
Sean, I had three links here: 1) the connector, 2) IBM cloud storage, and 3) a developerWorks article that ties them together, so I separated them with commas. Should I just remove the commas and use a space as the separator?
Yes, but I am wondering first whether this is the right place. `hadoop-cloud`, and thus Spark, doesn't have special support for this connector, and that's what this doc is about.
I'm also just noting that it seemed odd to put the comma within the hyperlinked text.
@srowen Thanks. I am not 100% sure whether this is the right place :-). Could you please help me understand why the Google Cloud Storage Connector for Spark and Hadoop is listed here? When I follow that entry through to the connector link, I end up at https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/gcs, which is the connector for Google Cloud Storage, and I thought that was similar to the Stocator link.
Actually I think this place is fine, after re-reading the doc. It is a more general reference. I would just fix the links a bit: `[...](...),` not `[...,](...)`.
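To illustrate the link fix suggested above, here is a minimal Python sketch. It is not part of the PR; the function name `move_commas_outside_links` and the regular expression are assumptions made for this example. It moves a trailing comma from inside the markdown link text to just after the link, and it assumes URLs that contain no parentheses.

```python
import re

# Matches a markdown inline link whose link text ends with a comma,
# e.g. "[Stocator,](https://github.com/CODAIT/stocator)".
# Assumption: the URL contains no ")" characters.
LINK_WITH_TRAILING_COMMA = re.compile(r"\[([^\]]*?),\]\(([^)]*)\)")


def move_commas_outside_links(markdown: str) -> str:
    """Rewrite "[text,](url)" as "[text](url)," (hypothetical helper)."""
    return LINK_WITH_TRAILING_COMMA.sub(r"[\1](\2),", markdown)


if __name__ == "__main__":
    line = "[Stocator,](https://github.com/CODAIT/stocator)"
    print(move_commas_outside_links(line))
    # prints: [Stocator](https://github.com/CODAIT/stocator),
```

Links whose text has no trailing comma are left unchanged, so running this over the whole file should only touch the entries in question.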
@srowen Thank you. I have updated the PR per your advice. I have also updated the screenshot.
Test build #110375 has finished for PR 25737 at commit
docs/cloud-integration.md
Outdated
@@ -257,4 +257,5 @@ Here is the documentation on the standard connectors both from Apache and the cl
* [Amazon EMR File System (EMRFS)](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-fs.html). From Amazon
* [Google Cloud Storage Connector for Spark and Hadoop](https://cloud.google.com/hadoop/google-cloud-storage-connector). From Google
* [The Azure Blob Filesystem driver (ABFS)](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-abfs-driver)
* IBM Cloud Object Storage connector for Apache Spark : [Stocator,](https://github.com/CODAIT/stocator) [IBM Object Storage,](https://www.ibm.com/cloud/object-storage) [how-to-use-connector](https://developer.ibm.com/code/2018/08/16/installing-running-stocator-apache-spark-ibm-cloud-object-storage)
nit: Is it better to add `. From IBM` like Amazon and Google?
Test build #110420 has finished for PR 25737 at commit
Merged to master
Thank you very much @srowen
…n cloud-integration.md
### What changes were proposed in this pull request?
Add links to IBM Cloud Storage connector in cloud-integration.md
### Why are the changes needed?
This page mentions the connectors to cloud providers. Currently connector to IBM cloud storage is not specified. This PR adds the necessary links for completeness.
### Does this PR introduce any user-facing change?
Yes.
**Before:**
<img width="1234" alt="Screen Shot 2019-09-09 at 3 52 44 PM" src="https://user-images.githubusercontent.com/14225158/64571863-11a2c080-d31a-11e9-82e3-78c02675adb9.png">
**After.**
<img width="1234" alt="Screen Shot 2019-09-10 at 8 16 49 AM" src="https://user-images.githubusercontent.com/14225158/64626857-663e4e00-d3a3-11e9-8fa3-15ebf52ea832.png">
### How was this patch tested?
Tested using jekyll build --serve
Closes apache#25737 from dilipbiswal/ibm-cloud-storage.
Authored-by: Dilip Biswal <dbiswal@us.ibm.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
What changes were proposed in this pull request?
Add links to IBM Cloud Storage connector in cloud-integration.md
Why are the changes needed?
This page lists the connectors for cloud providers. Currently, the connector for
IBM cloud storage is not listed. This PR adds the necessary links for
completeness.
Does this PR introduce any user-facing change?
Yes.
Before:
After:
How was this patch tested?
Tested using jekyll build --serve