
Add Postgresql Statefulsets with Replication to OpenShift/Kubernetes #4598

Closed
patrickdillon opened this issue Apr 18, 2018 · 23 comments

@patrickdillon
Contributor

As students in Boston University's EC528 Cloud Computing Course, my team has been working with @pdurbin, @danmcp & @DirectXMan12 to further the work in #4040 & #4168.

I have been working on scaling the postgres pods and am ready to open a pull request. My solution entails creating a statefulset and providing a new command at startup for the centos/postgres image.
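The startup command selects master vs. slave by parsing the pod's ordinal out of its hostname (StatefulSet pods are named `<statefulset-name>-<ordinal>`, and ordinal 0 becomes the master). Here's that logic as a standalone sketch, with a hypothetical example hostname:

```shell
#!/bin/bash
# Sketch of the master/slave selection logic from the StatefulSet startup command.
# StatefulSet pods are named <statefulset-name>-<ordinal>; ordinal 0 becomes the master.
hostname="dataverse-postgresql-0"   # hypothetical example; the real command calls `hostname`
[[ $hostname =~ -([0-9]+)$ ]] || { echo "no ordinal in hostname" >&2; exit 1; }
ordinal=${BASH_REMATCH[1]}
if [[ $ordinal -eq 0 ]]; then
  echo "would run: run-postgresql-master"
else
  echo "would run: run-postgresql-slave"
fi
```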

@patrickdillon
Contributor Author

Pull request: #4599

For context, our project description is available in the class repo, which I just switched from private to public:
https://github.com/BU-NU-CLOUD-SP18/Dataverse-Scaling

@pdurbin
Member

pdurbin commented Apr 18, 2018

@patrickdillon thanks for the pull request! As we discussed in today's call ( https://bluejeans.com/s/7dg33/ ) it sounds like there are a few tweaks you might want to make (possibly setting replicas to 1?) and some additional testing you might want to do before we pass this to QA. If there's anything you need, please let us know! Thanks!

@pdurbin
Member

pdurbin commented Apr 19, 2018

@patrickdillon I assigned this issue to you and at the moment it's in code review at our kanban board at https://waffle.io/IQSS/dataverse . I can move it back to the "development" column if you'd like. Please just keep me posted when you'd like more code review or if you're done making commits and want to move it to QA. Thanks!

@pdurbin
Member

pdurbin commented Apr 23, 2018

I spoke with @djbrooke about this issue and pull request this morning and the plan is for me to QA it. I'm in the middle of other coding but I'll try to get to it soon. @patrickdillon made some changes which are reflected in the pull request and which we just discussed at https://bluejeans.com/s/ygEhi

@pdurbin
Member

pdurbin commented Apr 24, 2018

The change to default.config is going to require me to push a new Dataverse/Glassfish image to Docker Hub. I've been talking to @MichaelClifford about my process for testing and to make it concrete, I'll explain below.

First, I add a remote and switch to the branch behind the pull request:

git fetch EC528-Dataverse-Scaling
git checkout 4598-openshift-postgresql

Then, I edit openshift.json to change the tag from latest to the name of the branch, but I won't be committing this change because when we merge the branch, I want it to still be latest:

murphy:dataverse pdurbin$ vi conf/openshift/openshift.json
murphy:dataverse pdurbin$ git diff conf/openshift/openshift.json
diff --git a/conf/openshift/openshift.json b/conf/openshift/openshift.json
index bf74631..a6c2089 100644
--- a/conf/openshift/openshift.json
+++ b/conf/openshift/openshift.json
@@ -212,7 +212,7 @@
               ],
               "from": {
                 "kind": "ImageStreamTag",
-                "name": "dataverse-plus-glassfish:latest"
+                "name": "dataverse-plus-glassfish:4598-openshift-postgresql"
               }
             }
           },
murphy:dataverse pdurbin$ 

To get ready to push images to DockerHub, I clean out the war file and the installer:

murphy:dataverse pdurbin$ mvn clean
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building dataverse 4.8.6
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ dataverse ---
[INFO] Deleting /Users/pdurbin/github/iqss/dataverse/target
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.808 s
[INFO] Finished at: 2018-04-24T08:53:46-04:00
[INFO] Final Memory: 7M/245M
[INFO] ------------------------------------------------------------------------
murphy:dataverse pdurbin$ cd scripts/installer/
murphy:installer pdurbin$ make clean
/bin/rm -rf dvinstall dvinstall.zip
murphy:installer pdurbin$ 
murphy:installer pdurbin$ cd ../..
murphy:dataverse pdurbin$ 

I take a look at https://hub.docker.com/r/iqss/dataverse-glassfish/tags/ to make sure I'm not going to overwrite someone else's tag.

Then, I run the build script, pushing to the branch name:

murphy:docker pdurbin$ ./build.sh 
No argument supplied. Please specify "branch" or "custom my-custom-tag" for experiments or "stable" if your change won't break anything.
murphy:docker pdurbin$ ./build.sh branch
We'll push a tag to the branch you're on.
Images will be pushed to Docker Hub with the tag "4598-openshift-postgresql".
...

This is the part that takes forever. Stay tuned!

@pdurbin
Member

pdurbin commented Apr 24, 2018

@patrickdillon I'm afraid I'm struggling a bit to test pull request #4599.

"dataverse-glassfish" is saying "No deployments. A new deployment will start automatically when an image is pushed to project1/dataverse-plus-glassfish:4598-openshift-postgresql." But I've already pushed the tag to Docker Hub. I'll include screenshots below. I'll also attach my openshift.json file, which has the edit above to change latest to 4598-openshift-postgresql, which is the tag I pushed.

Here's my openshift.json file, renamed with the commit in it so I remember which commit I'm on and with .txt added so I can upload it to GitHub: openshift.json-51cecfc.txt

screen shot 2018-04-24 at 9 31 22 am

screen shot 2018-04-24 at 9 31 28 am

@danmcp @DirectXMan12 if you have any ideas for me, please let me know. I hope it's just that I'm forgetting to do something simple.

@patrickdillon maybe you or one of the other students can try the image I pushed to the 4598-openshift-postgresql tag at https://hub.docker.com/r/iqss/dataverse-glassfish/tags/

@patrickdillon
Contributor Author

@pdurbin Unfortunately I never really figured out tags. I spent some time this morning trying to look at this in particular but I would need more time. Perhaps @danmcp or @DirectXMan12 could fix it the right way or show best practices.

I could never get any tags besides latest to work so my workflow was always to copy the image to my own personal repo with the latest tag. So it doesn't solve your exact problem but you could copy the image with a different name and the latest tag to your repo or the IQSS repo. That might let you test until we resolve the tag issue. Let me know if that doesn't make sense.
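That copy-to-a-personal-repo workflow looks roughly like this ("mydockeruser" is a hypothetical placeholder for your own Docker Hub username; untested sketch):

```shell
# Workaround sketch: retag a built image under a personal Docker Hub account
# with the "latest" tag, since only "latest" was resolving reliably.
# "mydockeruser" is a placeholder for your own Docker Hub username.
docker pull iqss/dataverse-glassfish:4598-openshift-postgresql
docker tag iqss/dataverse-glassfish:4598-openshift-postgresql mydockeruser/dataverse-glassfish:latest
docker push mydockeruser/dataverse-glassfish:latest
# Then point "dockerImageRepository" in openshift.json at mydockeruser/dataverse-glassfish.
```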

@danmcp
Contributor

danmcp commented Apr 24, 2018

@pdurbin To be clear, if you manually hit the deploy button, does it work?

@pdurbin
Member

pdurbin commented Apr 24, 2018

@patrickdillon ok, I guess I'd rather not pollute the "latest" tag at https://hub.docker.com/r/iqss/ with experimental images, but it sounds like you're saying I could set up a "pdurbin" Docker Hub account or whatever instead of using the "iqss" organization. This is like how I have my own fork of Dataverse under my GitHub username. That way I could leave the tag alone and use "latest". Sure. I could try that. Good idea. Thanks! I didn't realize that you were having an issue with tags.

@danmcp right. Here are screenshots from when I tried clicking "Deploy":

screen shot 2018-04-24 at 9 18 43 am

screen shot 2018-04-24 at 9 18 50 am

It's as if the tag I expected to be there (4598-openshift-postgresql) never got downloaded from Docker Hub to my laptop:

screen shot 2018-04-24 at 9 22 36 am

screen shot 2018-04-24 at 9 22 52 am

@pdurbin
Member

pdurbin commented Apr 25, 2018

Ok, I'm hacking on build.sh a bit (not having committed or pushed it yet) to get the images up on https://hub.docker.com/u/pdurbin/

Please stay tuned. Here are the changes I made:

murphy:docker pdurbin$ git diff build.sh
diff --git a/conf/docker/build.sh b/conf/docker/build.sh
index 95ff41b..9ed27c0 100755
--- a/conf/docker/build.sh
+++ b/conf/docker/build.sh
@@ -1,10 +1,15 @@
 #!/bin/sh
 # Creates images and pushes them to Docker Hub.
-# The "latest" tag should be relatively stable. Don't push breaking changes there.
+# The "latest" tag under "iqss" should be relatively stable. Don't push breaking changes there.
 # None of the tags are suitable for production use. See https://github.com/IQSS/dataverse/issues/4040
-# Push to custom tags or tags based on branch names to iterate on the images.
+# To interate on images, push to custom tags or tags based on branch names or a non-iqss Docker Hub org/username.
+# Docker Hub organization or username
+HUBORG=iqss
+# The most stable tag we have.
+STABLE=latest
+#FIXME: Use a real flag/argument parser. download-files.sh uses "getopts" for example.
 if [ -z "$1" ]; then
-  echo "No argument supplied. Please specify \"branch\" or \"custom my-custom-tag\" for experiments or \"stable\" if your change won't break anything."
+  echo "No argument supplied. For experiments, specify \"branch\" or \"custom my-custom-tag\" or \"huborg <USERNAME/ORG>\". Specify \"stable\" to push to the \"$STABLE\" tag under \"$HUBORG\" if your change won't break anything."
   exit 1
 fi
 
@@ -14,23 +19,32 @@ if [ "$1" == 'branch' ]; then
   TAG=$GIT_BRANCH
 elif [ "$1" == 'stable' ]; then
   echo "We'll push a tag to the most stable tag (which isn't saying much!)."
-  TAG=kick-the-tires
+  TAG=$STABLE
 elif [ "$1" == 'custom' ]; then
-  if [ -z "$1" ]; then
-    echo "You must provide a custom tag as the second argument."
+  if [ -z "$2" ]; then
+    echo "You must provide a custom tag as the second argument. Something other than \"$STABLE\"."
     exit 1
   else
     echo "We'll push a custom tag."
     TAG=$2
   fi
+elif [ "$1" == 'huborg' ]; then
+  if [ -z "$2" ]; then
+    echo "You must provide your Docker Hub organization or username as the second argument. \"$USER\" or whatever."
+    exit 1
+  else
+    HUBORG=$2
+    TAG=$STABLE
+    echo "We'll push to the Docker Hub organization or username you specified: $HUBORG."
+  fi
 else
   echo "Unexpected argument: $1. Exiting. Run with no arguments for help."
   exit 1
 fi
-echo Images will be pushed to Docker Hub with the tag \"$TAG\".
+echo Images will be pushed to Docker Hub org/username \"$HUBORG\" with the tag \"$TAG\".
 # Use "conf" directory as context so we can copy schema.xml into Solr image.
-docker build -t iqss/dataverse-solr:$TAG -f solr/Dockerfile ../../conf
-docker push iqss/dataverse-solr:$TAG
+docker build -t $HUBORG/dataverse-solr:$TAG -f solr/Dockerfile ../../conf
+docker push $HUBORG/dataverse-solr:$TAG
 # TODO: Think about if we really need dataverse.war because it's in dvinstall.zip.
 cd ../..
 mvn clean
@@ -58,6 +72,6 @@ if [[ "$?" -ne 0 ]]; then
 fi
 # We'll assume at this point that the download script has been run.
 cp ../../downloads/weld-osgi-bundle-2.2.10.Final-glassfish4.jar dataverse-glassfish
-docker build -t iqss/dataverse-glassfish:$TAG dataverse-glassfish
+docker build -t $HUBORG/dataverse-glassfish:$TAG dataverse-glassfish
 # FIXME: Check the output of `docker build` and only push on success.
-docker push iqss/dataverse-glassfish:$TAG
+docker push $HUBORG/dataverse-glassfish:$TAG
murphy:docker pdurbin$ 

@pdurbin
Member

pdurbin commented Apr 25, 2018

@patrickdillon success! It worked on the first try when I pushed images to https://hub.docker.com/u/pdurbin/ rather than https://hub.docker.com/u/iqss/ and left the tag alone, keeping it as "latest". Here's the change from "iqss" to "pdurbin" I made locally, which I won't commit:

murphy:dataverse pdurbin$ git diff conf/openshift/openshift.json
diff --git a/conf/openshift/openshift.json b/conf/openshift/openshift.json
index bf74631..a0706a3 100644
--- a/conf/openshift/openshift.json
+++ b/conf/openshift/openshift.json
@@ -100,7 +100,7 @@
         "name": "dataverse-plus-glassfish"
       },
       "spec": {
-        "dockerImageRepository": "iqss/dataverse-glassfish"
+        "dockerImageRepository": "pdurbin/dataverse-glassfish"
       }
     },
     {
@@ -120,7 +120,7 @@
         "name": "iqss-dataverse-solr"
       },
       "spec": {
-        "dockerImageRepository": "iqss/dataverse-solr"
+        "dockerImageRepository": "pdurbin/dataverse-solr"
       }
     },
     {
murphy:dataverse pdurbin$ 

Here's a screenshot of Dataverse running under this new OpenShift config. I'm still at 51cecfc because I haven't committed my change to build.sh above. I created a dataverse and a dataset to make sure they are indexed into Solr:

screen shot 2018-04-25 at 1 05 00 pm

I can see we're using Stateful Sets now ("Technology Preview"! 😄 ):

screen shot 2018-04-25 at 1 06 38 pm

Questions:

  • You don't have any objection to the changes I made to build.sh, do you? I assume you don't.
  • How else should I be testing this pull request? Over at https://groups.google.com/d/msg/dataverse-community/TSxf4MTYYjg/7VJB_-GJBAAJ you posted a YouTube video with a demo. Should I rewatch that video and try to figure out how to test things further? Or can you please provide me with some guidance? Thanks!

@patrickdillon
Contributor Author

@pdurbin I think the change to build.sh is great. I am glad to see these changes work with the Solr update.

Regarding further testing, in the pull request there is a short example of how to open the psql client on each container to show replication. In my example I just listed tables, but once you open the client you can run any SQL query, such as checking for your new dataverse. Let me know if those instructions are unclear.
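Those instructions boil down to something like the following (a sketch, not verbatim from the pull request; pod names follow the StatefulSet pattern shown later in this thread, and `dvndb` is the database name from openshift.json):

```shell
# Hypothetical sketch: open psql in each postgres pod and compare contents.
# dataverse-postgresql-0 is the master; higher ordinals are slaves.
oc rsh dataverse-postgresql-0 psql dvndb -c '\dt'   # list tables on the master
oc rsh dataverse-postgresql-1 psql dvndb -c '\dt'   # the same tables should appear on a slave
```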

@pdurbin
Member

pdurbin commented Apr 25, 2018

@patrickdillon ah, thanks. I didn't even think to look at the pull request but now I have more questions. And these are fundamental OpenShift questions so please forgive me. I only have one instance of postgres, right? Since replicas is set to 1. How in the GUI or via oc do I increase the number of replicas? Otherwise, there's no replication to see, is there? I'm sorry if I'm just completely confused. I know we've been talking about this for weeks but I haven't gotten my hands dirty with any of it.

I'm glad you're cool with the change to build.sh. It wasn't great to have iqss hard-coded in there. Since I believe you gave me push access, I'll probably just push the change into your branch.

@pdurbin
Member

pdurbin commented Apr 25, 2018

Whoops, I meant to put the issue number rather than the pull request number in my commit for build.sh but here it is: e7a56c7

@pdurbin
Member

pdurbin commented Apr 25, 2018

How in the GUI or via oc do I increase the number of replicas?

I just found oc scale dc/deployment-example --replicas=3 at https://docs.openshift.org/3.6/dev_guide/deployments/deployment_strategies.html so I might try that tomorrow. (I'm no longer at my desk.)

@patrickdillon honestly, I'm thinking about going ahead and merging your pull request because it doesn't seem to do any harm and I'd like @MichaelClifford to have his changes made on top of your changes. I assume you're both editing openshift.json. Any thoughts on this from either of you?

@patrickdillon
Contributor Author

@pdurbin Oh, right, I forgot I reduced the number of replicas before committing. The only way I know of increasing the number of replicas is by editing the .json and starting a new project. That should be enough for testing, but there is probably another way, such as oc scale (just guessing) for increasing the number of replicas in the stateful set.

@pdurbin
Member

pdurbin commented Apr 25, 2018

@patrickdillon whoops, I think we both posted at the same time. 😃

How do you and @MichaelClifford feel about me merging your pull request and maybe testing some of the replica stuff in his? His will have replicas too, right? Three Glassfish replicas or whatever? Again, I'm mostly just trying to stay conscious of merge conflicts in openshift.json and it would be nice to know that the two efforts are compatible with each other.

@patrickdillon
Contributor Author

patrickdillon commented Apr 25, 2018

@pdurbin That sounds good. Personally, I think merging it would be fine and it will work. Obviously we need to test, but I expect that to work out.

Regarding process, should we submit a new pr after having merged on our fork?

pdurbin added a commit that referenced this issue Apr 25, 2018
…postgresql

Add Postgresql statefulsets with master/slave replication on OpenShift. #4598
@pdurbin
Member

pdurbin commented Apr 25, 2018

@patrickdillon @MichaelClifford I just merged the pull request for this issue (#4599) because (again) it does no harm and even though I haven't fully tested all the replica goodness, there will be another opportunity in the new pull request that @MichaelClifford makes for #4617. So let's move our attention to that issue and make sure we branch from the latest in develop so we can make further edits to openshift.json. Thanks for this first pull request! Awesome stuff.

@pdurbin
Member

pdurbin commented Apr 26, 2018

As I mentioned at standup, I wanted to spend a little more time on this issue before moving on since I have it all running on my laptop anyway.

At a high level, I wanted to:

  • Figure out how to increase the number of postgres replicas from one.
  • Confirm that data is being replicated from a postgres master to slaves.
  • Kill the postgres master and see what happens.
  • Think about what the next steps might be.

I'm happy to report that I was able to do all of the things above, but as a project we should think more about where we're going with this effort. It's really interesting technology. I'll go through each of the items above.

Scaling the number of replicas

In a previous comment, I already posted an image of how the number of replicas is set to one. Here's how you can tell from the command line:

murphy:dataverse pdurbin$ oc describe statefulset/dataverse-postgresql
Name:			dataverse-postgresql
Namespace:		project1
CreationTimestamp:	Wed, 25 Apr 2018 12:56:51 -0400
Selector:		name=iqss-dataverse-postgresql
Labels:			app=dataverse
			name=iqss-dataverse-postgresql
Annotations:		openshift.io/generated-by=OpenShiftNewApp
			template.alpha.openshift.io/wait-for-ready=true
Replicas:		1 desired | 1 total
Pods Status:		1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:	app=dataverse
		name=iqss-dataverse-postgresql
  Containers:
   centos-postgresql-94-centos7:
    Image:	centos/postgresql-94-centos7
    Port:	5432/TCP
    Command:
      sh
      -c
      echo 'Setting up Postgres Master/Slave replication...'; [[ `hostname` =~ -([0-9]+)$ ]] || exit 1; ordinal=${BASH_REMATCH[1]}; if [[ $ordinal -eq 0 ]]; then run-postgresql-master; else run-postgresql-slave; fi;
    Limits:
      memory:	256Mi
    Environment:
      POSTGRESQL_USER:			dvnapp
      POSTGRESQL_MASTER_USER:		master
      POSTGRESQL_PASSWORD:		secret
      POSTGRESQL_MASTER_PASSWORD:	master
      POSTGRESQL_MASTER_SERVICE_NAME:	dataverse-postgresql-service
      POSTGRESQL_MASTER_IP:		dataverse-postgresql-0.dataverse-postgresql-service
      postgresql_master_addr:		dataverse-postgresql-0.dataverse-postgresql-service
      master_fqdn:			dataverse-postgresql-0.dataverse-postgresql-service
      POSTGRESQL_DATABASE:		dvndb
      POSTGRESQL_ADMIN_PASSWORD:	secret
    Mounts:				<none>
  Volumes:				<none>
Volume Claims:				<none>
Events:					<none>
murphy:dataverse pdurbin$ 

To scale from 1 postgres replica to 3, you can run oc scale statefulset/dataverse-postgresql --replicas=3 like this (I don't know how to do this from the GUI) and then check the results again:

murphy:dataverse pdurbin$ oc scale statefulset/dataverse-postgresql --replicas=3
statefulset "dataverse-postgresql" scaled
murphy:dataverse pdurbin$ 
murphy:dataverse pdurbin$ oc describe statefulset/dataverse-postgresql
Name:			dataverse-postgresql
Namespace:		project1
CreationTimestamp:	Wed, 25 Apr 2018 12:56:51 -0400
Selector:		name=iqss-dataverse-postgresql
Labels:			app=dataverse
			name=iqss-dataverse-postgresql
Annotations:		openshift.io/generated-by=OpenShiftNewApp
			template.alpha.openshift.io/wait-for-ready=true
Replicas:		3 desired | 3 total
Pods Status:		3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:	app=dataverse
		name=iqss-dataverse-postgresql
  Containers:
   centos-postgresql-94-centos7:
    Image:	centos/postgresql-94-centos7
    Port:	5432/TCP
    Command:
      sh
      -c
      echo 'Setting up Postgres Master/Slave replication...'; [[ `hostname` =~ -([0-9]+)$ ]] || exit 1; ordinal=${BASH_REMATCH[1]}; if [[ $ordinal -eq 0 ]]; then run-postgresql-master; else run-postgresql-slave; fi;
    Limits:
      memory:	256Mi
    Environment:
      POSTGRESQL_USER:			dvnapp
      POSTGRESQL_MASTER_USER:		master
      POSTGRESQL_PASSWORD:		secret
      POSTGRESQL_MASTER_PASSWORD:	master
      POSTGRESQL_MASTER_SERVICE_NAME:	dataverse-postgresql-service
      POSTGRESQL_MASTER_IP:		dataverse-postgresql-0.dataverse-postgresql-service
      postgresql_master_addr:		dataverse-postgresql-0.dataverse-postgresql-service
      master_fqdn:			dataverse-postgresql-0.dataverse-postgresql-service
      POSTGRESQL_DATABASE:		dvndb
      POSTGRESQL_ADMIN_PASSWORD:	secret
    Mounts:				<none>
  Volumes:				<none>
Volume Claims:				<none>
Events:
  FirstSeen	LastSeen	Count	From		SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----		-------------	--------	------			-------
  2m		2m		1	statefulset			Normal		SuccessfulCreate	create Pod dataverse-postgresql-1 in StatefulSet dataverse-postgresql successful
  2m		2m		1	statefulset			Normal		SuccessfulCreate	create Pod dataverse-postgresql-2 in StatefulSet dataverse-postgresql successful
murphy:dataverse pdurbin$ 

The GUI also reflects that the number of postgres replicas is now 3:

screen shot 2018-04-26 at 11 46 52 am

Confirm that data is being replicated from a postgres master to slaves

"dataverse-postgresql-0" is the master so I connected to the console for "dataverse-postgresql-1" to make sure data is being replicated there, and it is:

screen shot 2018-04-26 at 12 01 58 pm

I then followed the approach by @patrickdillon in the video posted at https://groups.google.com/d/msg/dataverse-community/TSxf4MTYYjg/7VJB_-GJBAAJ to make an edit in Dataverse and check that the edit is replicated from the postgres master to a postgres slave:

screen shot 2018-04-26 at 12 02 25 pm
screen shot 2018-04-26 at 12 02 50 pm
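Postgres can also report the roles directly: `pg_stat_replication` on the master lists connected slaves, and `pg_is_in_recovery()` returns true on a slave. A sketch, assuming the pod names above:

```shell
# Hypothetical sketch: ask postgres itself about replication status.
oc rsh dataverse-postgresql-0 psql -c 'select client_addr, state from pg_stat_replication;'
oc rsh dataverse-postgresql-1 psql -c 'select pg_is_in_recovery();'   # "t" on a slave
```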

Kill the postgres master and see what happens

Then I deleted the postgres master and not surprisingly, Dataverse isn't happy about this, showing errors on the home page and a stack trace in the Glassfish log:

screen shot 2018-04-26 at 12 03 15 pm

screen shot 2018-04-26 at 12 03 27 pm
screen shot 2018-04-26 at 12 04 11 pm
screen shot 2018-04-26 at 12 07 26 pm
screen shot 2018-04-26 at 12 07 40 pm

Think about what the next steps might be

What surprised me a bit is that the postgres master came back. I guess this is because I set the number of replicas to 3, so the system is just bringing the number back in line with what I had declared. However, while the newly recreated master came back with a "dvndb" database, that database was empty:

screen shot 2018-04-26 at 12 49 57 pm

I'm not familiar enough with postgres to know what to do to get Dataverse back up at this point but I guess I'd work on...

  • promoting a slave to a master
  • restoring data to the master
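For reference, on postgres 9.4 promoting a slave usually means running `pg_ctl promote` against its data directory. An untested sketch, with the data directory path assumed from the centos/postgresql-94 image:

```shell
# Untested sketch: promote a postgres 9.4 slave to master.
# The data directory path is an assumption based on the centos/postgresql-94 image.
oc rsh dataverse-postgresql-1 pg_ctl promote -D /var/lib/pgsql/data/userdata
# Dataverse would then need to be repointed at the promoted pod, and the old
# master rebuilt as a slave (e.g. with pg_basebackup).
```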

Something like that. Anyway, I don't want this disaster recovery scenario I'm talking about to overshadow the fact that all of this automatic replication is very cool and a great step forward. Thank you @patrickdillon and the rest of the students for working on this!

I'm going to close this issue but anyone reading this is welcome to leave comments!

@pdurbin pdurbin closed this as completed Apr 26, 2018
@pdurbin pdurbin removed their assignment Apr 26, 2018
@patrickdillon
Contributor Author

@pdurbin Regarding master recovery, we were aware of this issue but I forgot to document it in the pull request. I have been working with @ryanmorano on recovering the master's state. The solution we are trying is to create a persistent volume on the cluster and then a persistent volume claim for the statefulset. In our testing, we were only able to make this work with one replica, but after discussion with @danmcp he suggested that if we create as many persistent volumes as there are statefulset replicas then it should work. We have yet to test it.
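In openshift.json terms, that would mean declaring volumeClaimTemplates in the statefulset spec, so each replica binds its own PersistentVolumeClaim to one of the pre-created persistent volumes. A hypothetical, untested sketch (name and size are placeholders):

```json
"volumeClaimTemplates": [
  {
    "metadata": { "name": "postgresql-data" },
    "spec": {
      "accessModes": [ "ReadWriteOnce" ],
      "resources": { "requests": { "storage": "1Gi" } }
    }
  }
]
```

The container would also need a matching volumeMount pointing the claim at its data directory.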

@pdurbin
Member

pdurbin commented Apr 26, 2018

@patrickdillon right, now that you mention it I remember that we all talked about how the data in these containers is ephemeral and talked about persistent volumes. And back when I did the initial work on OpenShift support in #4168 I added an item under "known issues" in this area. http://guides.dataverse.org/en/4.8.6/developers/containers.html#known-issues-with-dataverse-images-on-docker-hub says, "The storage should be abstracted. Storage of data files and PostgreSQL data. Probably Solr data." This business of persistent data is what I was getting at.

I still think it's neat that the replication "just works". You can have only a postgres master and be humming along. Then you come along and bump up the number of replicas and each of those new slaves gets the data from the master. It's like magic. 😄

@pdurbin
Member

pdurbin commented Apr 27, 2018

I just ran ./build.sh stable to push images to the "latest" tag:

That way anyone who tries to follow http://guides.dataverse.org/en/4.8.6/developers/containers.html#openshift will get a working Dataverse installation. It works on my machine, anyway.

It's easy to forget to push to "latest" on Docker Hub after a pull request that touches openshift.json gets merged. Once the Dataverse on OpenShift effort becomes production-ready, I'm sure we'll formalize this process more.

Anyway, again, I just tested "latest" after pushing and it seems fine.
