
How can I extend the storage size after deployment? #4364

Closed
Alexander-He opened this issue May 18, 2017 · 16 comments

Comments

@Alexander-He commented May 18, 2017

I have a situation where, when I deploy my Minio server, I don't know what storage size I will need.

How can I extend the storage size after deployment?

Your Environment

  • Version used (minio version):
    Version: 2017-03-16T21:50:32Z
    Release-Tag: RELEASE.2017-03-16T21-50-32Z
    Commit-ID: 5311eb2
  • Server type and version:
    Deployment type: 4 Minio nodes sharing 1 drive; the 4 Minio clients are deployed directly on the operating system.
  • Operating System and version (uname -a):
    Linux LFG1000644016 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Link to your project:
    n/a
@krishnasrinivas (Member) commented May 18, 2017

@Alexander-He you will have to launch a new minio instance with new storage and start using it.
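A minimal sketch of what that can look like, assuming the new storage is a fresh drive mounted on the same or another machine (the path and port below are made up for illustration):

 # Keep the existing instance running; start a second, independent Minio
 # server on the new storage and point new workloads at it:
 minio server --address ":9001" /mnt/new-drive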

@nitisht (Contributor) commented May 18, 2017

@Alexander-He Minio recommends a cloud-native approach to scaling your storage infrastructure. When you run out of storage, you simply spin up new Minio instance(s) per tenant. A tenant can be a user, a group of users, or any other kind of aggregation of data.

Take, for example, a CCTV video storage use case. Initially you wouldn't know how much storage you could possibly need to store the videos. In such cases, you can assign one Minio instance to store videos from the first month, another Minio instance to store videos from the second month, and so on. To make sure the application remembers which data is stored in which Minio instance, you can have a database storing the mapping from date range to the corresponding Minio instance.

This way you can scale infinitely, while your storage and application remain relatively simple.

Take a look at our multi-tenant deployment guide: https://docs.minio.io/docs/multi-tenant-minio-deployment-guide
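As an illustrative sketch of that per-month tenant pattern using today's mc alias and mc cp commands (the hostnames, alias names, and credentials below are hypothetical):

 # One Minio instance per month, each registered under its own alias:
 mc alias set cctv-2017-05 http://minio-2017-05.example.com ACCESSKEY SECRETKEY
 mc alias set cctv-2017-06 http://minio-2017-06.example.com ACCESSKEY SECRETKEY

 # The application's date-range-to-instance table decides which alias to use,
 # and uploads/reads for that period go to the matching instance:
 mc cp camera1/2017-05-18.mp4 cctv-2017-05/videos/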

@deekoder added this to the Edge cache milestone May 18, 2017
@deekoder (Contributor) commented May 18, 2017

Related to #4366

@Alexander-He (Author) commented May 19, 2017

@nitisht I see, this helped me, thank you very much.

@harshavardhana (Member) commented Apr 1, 2020

Expansion is already part of the main storage stack (https://docs.minio.io/docs/distributed-minio-quickstart-guide.html), so anyone landing here, please read the latest documentation.

@ericmutta commented May 13, 2021

Expansion is already part of the main storage stack (https://docs.minio.io/docs/distributed-minio-quickstart-guide.html), so anyone landing here, please read the latest documentation.

@harshavardhana / @nitisht I have some questions about the above link. Suppose I start off with 4 nodes and run this:

minio server http://host{1...4}/export{1...16}

...then, as per the docs, when I want to expand by adding more nodes, it says I should run a command like this:

minio server http://host{1...4}/export{1...16} http://host{5...12}/export{1...16}

The questions:

  1. Do I run the longer command on just the newly added nodes or do I have to go back to the initial 4 nodes and run it there too?
  2. If I have to run it on both sets of nodes, does the order in which I do so matter?
  3. When I run it on the initial set of nodes where MinIO was already running, do I have to stop MinIO first? Does that disrupt active connections?

I am planning to run MinIO on cloud VMs that are dynamically created using an API (e.g. in batches of 4 nodes at a time) and I want all instances to appear as one unified data store to applications that access it. I would be interested in knowing how MinIO can be configured to work in this scenario (ideally MinIO would just enumerate other instances using DNS and expand itself).

@harshavardhana (Member) commented May 13, 2021

The questions:

  1. Do I run the longer command on just the newly added nodes or do I have to go back to the initial 4 nodes and run it there too?

Yes

  2. If I have to run it on both sets of nodes, does the order in which I do so matter?

Yes, all command lines have to be the same.

  3. When I run it on the initial set of nodes where MinIO was already running, do I have to stop MinIO first? Does that disrupt active connections?

Yes, of course.
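To make that concrete, a sketch of the expansion sequence for the 4-node example above (how you stop and start the service, e.g. systemd or a plain process, depends on your setup):

 # 1. Stop the minio server process on the original host1..host4.
 # 2. Start every node, old and new, with the identical expanded command line:
 minio server http://host{1...4}/export{1...16} http://host{5...12}/export{1...16}
 # 3. Once all nodes are up with the same arguments, the new pool is in use.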

@ericmutta commented May 13, 2021

  2. If I have to run it on both sets of nodes, does the order in which I do so matter?

Yes, all command lines have to be the same.

@harshavardhana thanks for the fast response! Let me clarify here. First, I have 4 nodes, so I run this:

minio server http://host{1...4}/export{1...16}

Next, I want to add other nodes, so now the command has to be:

minio server http://host{1...4}/export{1...16} http://host{5...12}/export{1...16}

My question is: if I run the command on the new nodes, then they are aware of the old nodes, but the old nodes are not aware of the new nodes until I restart MinIO on each one of them. It somewhat resembles a "split brain" scenario, and I am wondering if that can cause problems.

@harshavardhana (Member) commented May 13, 2021

My question is: if I run the command on the new nodes, then they are aware of the old nodes, but the old nodes are not aware of the new nodes until I restart MinIO on each one of them. It somewhat resembles a "split brain" scenario, and I am wondering if that can cause problems.

All command lines have to be the same everywhere, whether in the expanded or the non-expanded setup. You cannot add new pools to a running system.

@ericmutta commented May 13, 2021

@harshavardhana You cannot add new pools to a running system.

Thanks again for clarifying! I am still working through the docs and have two MinIO instances on Linode.com that I am experimenting with (love how simple this thing is to use!). If you can point me to any docs that discuss the way pools work, what you can/can't do with them, etc., that would be awesome! 👍

PS: I submitted a few PRs for the docs here and here.

@NMi-ru commented Jun 4, 2021

if I run the command on the new nodes, then they are aware of the old nodes

My best-case experience is this (sketched as commands after the list):

  1. I start the new "A+B" config on the new set of nodes (B).
  2. The first node of B becomes the coordinator and starts spitting out log lines like "marking http://nodeB2,3,4 temporary offline".
  3. As I launch the processes on the B nodes, these lines start to disappear. After I have launched all nodes of B, I can see the message "Waiting for the first server to format the disks". This is the time to restart the A nodes with the new config.
  4. I restart the first node of A with the new (A+B) config. Here come warnings along the lines of "you want me to start with 32 disks, but already existing data suggests that there should be 16 disks". My guess is that we get some service downtime at this point.
  5. After the last node of A has been restarted with the new config, the first node of A (the coordinator) synchronizes the new cluster and everything settles: the healing process starts and creates/modifies ".minio.sys" directories/buckets to reflect the new configuration, and we see "AM initialization complete" on all nodes of the new cluster.
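Put as commands, that sequence is roughly the following; the hostnames and drive paths are assumptions, chosen so the disk counts line up with the 16-vs-32-disk warning mentioned above:

 # Steps 1-3: start every new B node with the expanded (A+B) command line:
 minio server http://nodeA{1...4}/data{1...4} http://nodeB{1...4}/data{1...4}

 # Steps 4-5: restart each original A node, one at a time, with the exact
 # same expanded command line until all eight nodes run identical arguments:
 minio server http://nodeA{1...4}/data{1...4} http://nodeB{1...4}/data{1...4}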

@harshavardhana (Member) commented Jun 4, 2021


@NMi-ru this has never been tested, so whatever you are attempting here is brave. If it works, then 👍🏽; if it does not, then it was never designed to work for these situations.

There is an upcoming change that allows dynamic expansion of an existing deployment, which would make this entire approach of waiting for nodes obsolete, since you will be able to add pools at runtime.

 ~ cat ../spec.md
 ## Server command line on all servers
 # minio server --drives "/mnt/data{1...4}"

 ## Setup
 # mc admin setup alias/ http://host{1...4} [--root-user "rootuser", --root-password "rootpassword"]

 ## Priority v1
 # mc admin pool add alias/ http://host{5...8} (adds a fresh pool)
 # mc admin pool delete alias/ POOL_ID (drains and spreads across remaining pools)
 # mc admin pool list alias/ [POOL_ID] (lists all pools)
 # mc admin pool suspend alias/ POOL_ID
 # mc admin pool resume alias/ POOL_ID

 ## Priority v2
 # mc admin pool rebalance (balance storage across all pools)

 ## Priority v3
 # mc admin pool merge POOL_ID1 ... POOL_IDN

This will allow for pool management as well as being able to expand at will.

@MoonJustry (Contributor) commented Jul 16, 2021

There is an upcoming change coming that allows for the dynamic expansion of an existing deployment, so that would make this entire approach of waiting for nodes obsolete as you can add pools at runtime.

Hi @harshavardhana, that sounds great. Is there any plan or roadmap for tracking this dynamic expansion feature?

@harshavardhana (Member) commented Jul 16, 2021

Hi @harshavardhana, that sounds great. Is there any plan or roadmap for tracking this dynamic expansion feature?

It's on my laptop right now. There is no real timeline for when it will be available; all I can say is soon.

@Rush commented Jul 27, 2021

I have a question regarding the docs:

New object upload requests automatically start using the least used cluster. This expansion strategy works endlessly, so you can perpetually expand your clusters as needed.

Does it mean that old files will not get the erasure coding benefits of the expanded nodes?

I would like all files (both new and old) to have erasure codes spread across all disks.

@Rush commented Jul 27, 2021

Oh, I think I can answer this myself. The documentation refers to clusters, which are picked up with the three-dots ... syntax. Clusters have independent erasure code sets.
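In command-line terms (reusing the host names from the earlier example), each ellipsis group is one such cluster, now usually called a pool, with its own erasure sets:

 # Pool 1 (original): erasure sets built only from host1..host4 drives.
 # Pool 2 (expansion): erasure sets built only from host5..host12 drives.
 # As quoted above, new uploads go to the least-used pool; erasure coding
 # never spans pools, so old objects keep their original layout.
 minio server http://host{1...4}/export{1...16} http://host{5...12}/export{1...16}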
