Multicloud feature enhancements #45

Merged
merged 7 commits into from Sep 24, 2019

@kumarashit
Contributor

commented Aug 5, 2019

First cut for the design spec

@kumarashit kumarashit changed the title Design Spec for multi-cloud and YIG integration [WIP]Design Spec for multi-cloud and YIG integration Aug 12, 2019
@kumarashit kumarashit force-pushed the kumarashit:aks_daito branch from 2c1f2ca to 8844a74 Aug 22, 2019
@kumarashit kumarashit changed the title [WIP]Design Spec for multi-cloud and YIG integration Multicloud feature enhancements Aug 22, 2019
@@ -0,0 +1,325 @@
# Multi-Cloud feature enhancements

**Author(s)**: [Shufang Zeng](https://github.com/sfzeng) [Neelam Gupta](https://github.com/neelamgupta1491) [Ashit Kumar](https://github.com/kumarashit)

@xing-yang

xing-yang Aug 22, 2019

Collaborator

Please add comma between authors.

S3 based REST API
Containerized deployment
Metadata Management and monitoring
Easy integration of backend storage adapters (on-prem or Cloud)

@xing-yang

xing-yang Aug 22, 2019

Collaborator

Please add "* " in front of each line. Without "* ", lines 18-24 are displayed in one paragraph with only a space separating each sentence.

* IBM COS
* Ceph S3

### New features to be supported:

@xing-yang

xing-yang Aug 22, 2019

Collaborator

Are these new features to be supported without YIG?

* Server-side encryption
* Versioning
* Post object

@xing-yang

xing-yang Aug 22, 2019

Collaborator

It is still not clear how YIG can help object store backends other than Ceph. Please elaborate. Can these features on YIG Ceph pool be expanded to apply on AWS S3, Azure Blob, etc.?

* Bucket Policy
* Configurable buffered copy object
* HEAD Bucket
* HEAD Object

@xing-yang

xing-yang Aug 22, 2019

Collaborator

Add a sentence to explain each of these features.


### Non-Goals

Features not supported by OpenSDS multi-cloud (Gelato) or YIG are not included.

@xing-yang

xing-yang Aug 22, 2019

Collaborator

This is too vague. If you don't have a concrete non-goal, just leave it out.



#### Expanding the YIG S3:
YIG uses tiDB for storing metadata. These metadata are:

@xing-yang

xing-yang Aug 22, 2019

Collaborator

s/tiDB/TiDB


## Open issues

a) SQL DB ex, TiDB comes with relational SQL server and TiKV helping to get the scalability and performance of NoSQL. Currently OpenSDS multi-cloud management uses MongoDB.

@xing-yang

xing-yang Aug 22, 2019

Collaborator

Formatting is wrong. Use "* " to replace a)b)c)



#### Modified schema for adding new features
**Bucket**

@xing-yang

xing-yang Aug 22, 2019

Collaborator

The format of the following table is wrong.


![gelato-multicloud-yig](gelato-multicloud-yig.png?raw=true "Gelato Multicloud with YIG")


@xing-yang

xing-yang Aug 23, 2019

Collaborator

We need some detailed description on how migration works with YIG integration, i.e., what is the role of each component when moving objects from source bucket to destination bucket.


## Summary

OpenSDS provides multi-cloud management platform. The Goal of multi-cloud is to enable data autonomy for multi-cloud environments. Currently OpenSDS multi-cloud (Gelato) provides multiple features for Object Data Migration, Lifecycle and Management across multiple clouds like AWS, Azure, GCP, Huawei, IBM. Also, Gelato need to increase the multi-cloud supportability matrix. In the next release, OpenSDS will add some key features, add new on-prem Object storage i.e. YIG and integrate with YIG to enhance the multi-cloud features of OpenSDS.

@anvithks

anvithks Aug 23, 2019

nit: Goal -> goal
"Also, Gelato need to ...." -> "Additionally, we need to increase the multi-cloud support of Gelato."
nit: on-prem -> on-premise

The purpose of this design is to add new features and analyse and facilitate the OpenSDS multi-cloud (Gelato) integration with YIG (Yet Another Index Gateway)

@anvithks

anvithks Aug 23, 2019

nit: "add new features and analyse and facilitate..." -> "add new features, analyse and facilitate the...."
nit: The YIG acronym can be explained at first occurrence above.


One of the Key components of this is 'S3 service'. S3 service handles s3 API requests for communicating with Object Storage backends. OpenSDS multi-cloud API gateway should be compatible with any s3 compatible API service. This s3 service should provide or should be flexible to provide all the multi-cloud features. [YIG](https://github.com/journeymidnight/yig ) is Object Storage Framework which is s3 compatible. YIG brings in multiple features which fits-in into the multi-cloud requirements.

@anvithks

anvithks Aug 23, 2019

nit: Key -> key
"S3 service handles s3 API..." -> "S3 service handles S3 API..."
All occurrences of s3 -> S3, or vice versa, consistently.
"YIG is Object...." -> "YIG is an Object....."
"...which fits-in into" -> "...which fits into the ..."

YIG brings in some of the features which can complement and enhance OpenSDS multi-cloud management platform. This also brings in opportunity for OpenSDS to be THE Platform for all multi-cloud management. YIG has some features which OpenSDS multi-cloud doesn't have. So YIG can help adding those features into OpenSDS multi-cloud.


How YIG can help in enhancing multi-cloud capabilities of Gelato:

@anvithks

anvithks Aug 23, 2019

This line can be a sub heading instead of a statement.

| --- | --- | --- | --- |
| tier | int32 | F | |
| backendtype | int32 | F | |
| storageclass | int32 | F | Each backendhas it's own definition of storage classes,and each storage class can be mapped to a specific tier in OpenSDS. |

@sfzeng

sfzeng Aug 26, 2019

Need a blank between 'backend' and 'has'.


In current form, YIG supports Ceph as the Object storage. YIG will expand it to support multi-cloud Storage Backend for the new features of OpenSDS multi-cloud
Here is the flow:
a) Client issue an API request

@sfzeng

sfzeng Aug 26, 2019

The format is not right.

* Regulatory compliance
* Data Security

### Bucket Location

@himanshuvar

himanshuvar Aug 27, 2019

Contributor

Can you elaborate on specifying the bucket location? Is it for OpenSDS virtual buckets?

#### For this approach here is the high-level architecture:

![gelato-multicloud-yig](gelato-multicloud-yig.png?raw=true "Gelato Multicloud with YIG")

@himanshuvar

himanshuvar Aug 31, 2019

Contributor

In the architecture diagram, what does "multiple instances, scalable" mean? Do we create multiple service instances, i.e. for S3 and datamover? If yes, what would be the factors for scaling up the services?


Gelato follows the database per service microservices architecture. This means that all the components will have their own database. Gelato microservices architecture currently provides the flexibility, per service, to have their own DB adapter and interact with the respective database for persistent storage.
With this release, Gelato will support MongoDB for services like Datamover, Dataflow and Backend.
S3 service can support both TiDB and MongoDB. YIG project already comes with TiDB support, which can be leveraged.

@himanshuvar

himanshuvar Aug 31, 2019

Contributor

When supporting multiple databases, are both databases supported at the same time? Are we planning to maintain multiple copies of metadata across these databases? If yes, are we considering data consistency?
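
For readers following this thread, the database-per-service arrangement in the quoted spec text can be illustrated with a minimal sketch. It is not Gelato or YIG code: `MetadataStore`, `InMemoryStore`, `open_store` and `DEFAULT_DRIVERS` are hypothetical names, the in-memory class merely stands in for a real MongoDB or TiDB adapter, and it assumes that each service uses exactly one driver per deployment, which the spec does not confirm.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional

# Hypothetical defaults mirroring the quoted spec text: MongoDB for Datamover,
# Dataflow and Backend. TiDB is picked here for the S3 service as an assumption;
# the spec only says the S3 service can support both TiDB and MongoDB.
DEFAULT_DRIVERS: Dict[str, str] = {
    "datamover": "mongodb",
    "dataflow": "mongodb",
    "backend": "mongodb",
    "s3": "tidb",
}


class MetadataStore(ABC):
    """Hypothetical per-service DB adapter interface."""

    @abstractmethod
    def put(self, key: str, value: dict) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> Optional[dict]:
        ...


class InMemoryStore(MetadataStore):
    """Stand-in for a real adapter; keeps rows in a dict and records the driver name."""

    def __init__(self, driver: str) -> None:
        self.driver = driver
        self._rows: Dict[str, dict] = {}

    def put(self, key: str, value: dict) -> None:
        # Store a copy so later changes by the caller do not leak into the store.
        self._rows[key] = dict(value)

    def get(self, key: str) -> Optional[dict]:
        row = self._rows.get(key)
        return dict(row) if row is not None else None


def open_store(service: str, overrides: Optional[Dict[str, str]] = None) -> InMemoryStore:
    """Return a fresh, private store for one service, using its configured driver."""
    drivers = {**DEFAULT_DRIVERS, **(overrides or {})}
    if service not in drivers:
        raise KeyError(f"no database configured for service {service!r}")
    return InMemoryStore(drivers[service])


if __name__ == "__main__":
    s3_store = open_store("s3")
    s3_store.put("bucket1", {"tier": 1})
    print(s3_store.driver, s3_store.get("bucket1"))
```

Under that one-driver-per-service assumption, no metadata is copied between databases, so cross-database consistency would only become a concern if a service wrote to both at once.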

Collaborator

left a comment

LGTM, we can merge and update further on the same.

@wisererik


commented Sep 11, 2019

@kumarashit please update the authors who work on this PR

@kumarashit

Contributor Author

commented Sep 11, 2019

> @kumarashit please update the authors who work on this PR

Done. I have addressed most of the comments and added inputs from YIG team.

@wisererik wisererik merged commit 0943f17 into opensds:master Sep 24, 2019