using flockdb on a cluster of machines #64

Open
dhruvg opened this Issue · 6 comments

2 participants

@dhruvg

The tutorial creates an environment where all shards are on the localhost. I was wondering if someone could share how they went about partitioning the data amongst multiple hosts.

I am tinkering with the gizzmo add-host subcommand but can't get it to work. Is this the right way to go? Also, do I have to update the .gizzmorc file? Overall, I am unclear on how to extend this tutorial to a case where I actually have a cluster of a few machines.

Also, after adding new hosts, and restarting flockdb by running setup-env.sh, will the setup automatically take advantage of multiple hosts? Will the data be erased and reset to default?

Thanks

@ksauzz

Hi,

I think you can do this with mkshards.rb, which creates multiple shards across multiple hosts. Running mkshards.rb seems to require the patch, though, so please apply it first.

mkshards.rb usage

 ./mkshards.rb -f conf_file -n shards_count  graph_id

 ex) ./mkshards.rb -f shards.yml -n 100 1

configuration file format

app_host: flapp.server:7920
databases:
  - - mysql.server01:1                  # replica set A
    - mysql.server02:1                  # replica set A 
  - - mysql.server03:1                  # replica set B
    - mysql.server04:1                  # replica set B
  • app_host - An application server (flapp) that receives the requests from mkshards.rb. Please choose one of your application servers.
  • databases - MySQL servers and their weights. Each replica set is declared as a YAML array.
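
To make the nesting concrete, here is a minimal sketch that reads the same structure and splits each entry into host and weight. It assumes the config has already been loaded into plain Python data (the dict below just mirrors the YAML above), and the names Database, parse_entry and parse_replica_sets are hypothetical helpers for illustration, not part of mkshards.rb.

```python
from typing import List, NamedTuple


class Database(NamedTuple):
    host: str
    weight: int


def parse_entry(entry: str) -> Database:
    """Split a 'hostname:weight' string into its two parts."""
    host, sep, weight = entry.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected 'hostname:weight', got {entry!r}")
    return Database(host, int(weight))


def parse_replica_sets(config: dict) -> List[List[Database]]:
    """Return one list of Database entries per replica set."""
    return [[parse_entry(entry) for entry in replica_set]
            for replica_set in config["databases"]]


# Same content as the YAML example, written as Python data so that
# no YAML library is needed to run this sketch.
config = {
    "app_host": "flapp.server:7920",
    "databases": [
        ["mysql.server01:1", "mysql.server02:1"],  # replica set A
        ["mysql.server03:1", "mysql.server04:1"],  # replica set B
    ],
}

replica_sets = parse_replica_sets(config)
for index, replica_set in enumerate(replica_sets):
    hosts = ", ".join(db.host for db in replica_set)
    print(f"replica set {index}: {hosts}")
```

Each inner list is one replica set, so the example describes two replica sets with two MySQL servers each.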

To add a new MySQL server, you need to set up the server and then migrate some shards to it.

FYI, setup-env.sh is for demo use only.

enjoy! :)

@dhruvg

Thanks, will try this out later today. Just to be clear, mysql.serverXX:1 are the machines where the graph data will actually reside, correct? Also, syntactically, is "mysql" part of the hostname, or is it a prefix to the hostname that indicates the database being used?

@ksauzz

Hi,

> mysql.serverXX:1 are the machines where the graph data will actually reside, correct?

Yes. The mysql.serverXX hosts are MySQL servers that store the graph data and the shard management data.
See the second image on the blog.

> Also, syntactically, is "mysql" part of the hostname or some prefix to the hostname to indicate the database being used?

It is just part of the hostname. The format is as follows. :)

app_host: hostname:port
databases:
  - - hostname:weight
    - hostname:weight 
  - - hostname:weight
    - hostname:weight
@dhruvg

Thanks for clarifying. How does all this tie in with the .gizzmorc file? Currently, my .gizzmorc is:

host: localhost
port: 7920

Is the file supposed to list the flapp servers/ports on which flockdb is running? On a related note, what is the process of adding a new flapp server? How do the new servers get educated about the forwarding tables and shard locations, etc...?

Thanks.

@ksauzz

> Is the file supposed to list the flapp servers/ports on which flockdb is running?

Yes. Please try using hosts instead of host.

hosts: hostA,hostB
port: 7920
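
If you have more than a couple of flapp servers, you could generate the file instead of editing it by hand. This is only a sketch: it assumes the two keys shown above (hosts as a comma-separated list, plus a single port) are all you need, and render_gizzmorc is a hypothetical helper, not something gizzmo provides.

```python
def render_gizzmorc(hosts, port=7920):
    """Build .gizzmorc text with a comma-separated hosts line and one port."""
    if not hosts:
        raise ValueError("at least one flapp host is required")
    return f"hosts: {','.join(hosts)}\nport: {port}\n"


text = render_gizzmorc(["hostA", "hostB"])
print(text, end="")
```

Write the returned text to your .gizzmorc once you are happy with it.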

> what is the process of adding a new flapp server?

The process is:

  1. Set up the new flapp server.
  2. Edit the config file to add the MySQL servers. (That is production.scala in a production deployment.)
  3. Start flapp. It then reads the forwarding tables and shard locations from MySQL.
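
A toy model of step 3 may help show why a new flapp does not need to be taught anything by the existing ones: the routing state lives in MySQL, and every flapp loads it at startup. Everything here is a simplification for illustration. NameServer, Flapp and locate are hypothetical names, not FlockDB or Gizzard APIs, and the real forwarding tables are more involved than one host per graph.

```python
class NameServer:
    """Stands in for the MySQL database that holds the shard locations."""

    def __init__(self):
        # Simplified assumption: one shard host per graph_id.
        self.shard_locations = {}


class Flapp:
    """Stands in for one flapp application server."""

    def __init__(self, name, nameserver):
        self.name = name
        # On startup, copy the routing data from the shared store.
        self.shard_locations = dict(nameserver.shard_locations)

    def locate(self, graph_id):
        """Return the host this flapp would route the graph to."""
        return self.shard_locations[graph_id]


nameserver = NameServer()
nameserver.shard_locations[1] = "mysql.server01"

existing = Flapp("flapp-1", nameserver)
added_later = Flapp("flapp-2", nameserver)  # no manual setup of routing data

print(existing.locate(1), added_later.locate(1))
```

Both instances end up with the same view because they read it from the same place, not from each other.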
@dhruvg

In the config file described in step 2, there seem to be two places where a database connection is used. There are the production name server replicas, which connect to a database called "flock_edges_production", and there is a second set of database connections, which connect to the "edges" database. Just to clarify: the former database stores the forwarding table, etc., and the latter stores the actual edges for the shards on that database's host. So, when you say add MySQL servers, I understand the following:
1. I add to the set of MySQL connections to the "edges" database, using the same hostnames as the ones in the config file you mapped out above.
2. I am confused about where the forwarding tables and shard location data are supposed to be stored. Are they stored on the same hosts that store the "edges" database, or on the flapp server? How does the new flapp server learn about that data? Do I have to make a name server replica of an existing flapp server?

Thanks, and I really appreciate your help!
