
Scenario for deploying and running Java EE batch processing applications with WildFly on OpenShift #62

Merged
merged 28 commits
Jan 8, 2018

Conversation

chengfang
Contributor

No description provided.

@thesteve0
Contributor

thesteve0 commented Sep 28, 2017

Hey @chengfang - thanks for putting this together. Can you please give us the URL for this scenario running in your katacoda account? It is easier to test and review when looking there. For edits, do you want us to do PRs on your repo or just send you page and line #s?

FYI, for other people's reference: https://github.com/jberet/jsr352

@chengfang
Contributor Author

@thesteve0 This scenario in my katacoda account:
https://katacoda.com/cfang/courses/intro-openshift/java-batch-processing

For edits, a PR is probably easier, though both ways are okay.

Steven Pousty added 5 commits September 28, 2017 17:45
Looks good, some minor changes, along with the need for 1 more paragraph just to explain the purpose of the Spec. Go ahead and quote a nice paragraph from the spec if you want
and also making the URL nicer
nothing but some edits and corrections on wording
I didn't realize they didn't use empty dir and actually used disk space in the running container
you could make this a lot easier by copying off the terminal. I am also concerned that we are hard coding IPs but I will see what we do next
@thesteve0
Contributor

Ok so I think I did the changes the wrong way - I will fork your repo in the future, make a branch and then make edits there.

I am stopping at step 6, which I have not reviewed. It seems like you could wire together the db and WildFly application in your git repo rather than having the user do all these steps in the terminal. It would make the scenario shorter and clearer.

I also think you may want to do a little explanation of the code. Maybe as a follow up say in the next Scenario (coming soon) we will show you how to write your own simple job and get it running in OpenShift.

Ping me if you want to discuss more we can set up a web chat to discuss. I am on US Pacific Time.

@chengfang
Contributor Author

@thesteve0 , thanks for the review! I'll try to respond to some comments here, and will get to more tomorrow.

About getting the PostgreSQL POD IP: I currently have the user copy it from the POD web terminal. As you suggested, the POD details page also shows an IP address, but the two seem to be different values. I can connect to the db with the IP copied from the web terminal (POSTGRESQL_SERVICE_HOST=172.30.xxx.xxx), but I haven't tried using the IP from the POD details page:

Status: Running
Deployment: postgresql, #1
IP: 10.128.xx.xx
Node: ip-172-31-60-39.ec2.internal (172.31.xx.xx)

About wiring the db params directly in my demo app: I was following the steps in the book "OpenShift for Developers" (a free-to-download book from Red Hat):

In order to be able to communicate to the database using these environment variables, we need to add them to DeploymentConfig for our insult application. Adding the environment variables to DeploymentConfig instead of directly in the running pod ensures that any new pod will be started with the variables it needs to connect. You could certainly just hard code these values in your application code but that is not a best practice! This can be done using the following command:
oc env dc insults -e POSTGRESQL_USER=insult -e PGPASSWORD=insult POSTGRESQL_DATABASE=insults

I couldn't find a way to directly wire the db host value inside my demo app, since the batch app will be running inside the WildFly POD, separate from the db POD.

I did not ask the user to copy this environment variable into the WildFly POD; instead I pass this information as part of the REST API request URL, which saves a step. The reason I ask the user to set the DB environment variable in the client terminal is to have portable commands users can click and run. Alternatively, we could skip setting the DB environment variable and ask the user to remember the db host value somewhere, then include it in the REST API URL where needed.
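The flow described above amounts to just a couple of shell lines. This is a minimal sketch; the host, endpoint path, and `dbHost` parameter name are illustrative (only the job name csv2db appears in the scenario output):

```shell
# Set the DB host once in the client terminal (the value would be copied
# from the PostgreSQL pod's web terminal), then splice it into the REST URL.
# Hypothetical endpoint and parameter name, for illustration only.
POSTGRESQL_SERVICE_HOST=172.30.0.10
url="http://intro-jberet.example.com/intro-jberet/api/jobs/csv2db/start?dbHost=${POSTGRESQL_SERVICE_HOST}"
echo "$url"
# The actual request would then be: curl "$url"
```

Because the host is captured in one variable, the same command text works for every user regardless of which IP their pod received.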

@GrahamDumpleton
Contributor

You don't need to work out what the IP address is. The service name for the database, i.e., 'postgresql', will be added as a hostname to the internal DNS. As a result, any other application in the same project can just use 'postgresql' as the hostname to talk to the database. You do not need the IP address.

If the intent of the scenario is to demonstrate batch processing, and how to deploy the application and database isn't really that important (similar to the intent Steve expressed), I would look at including in the scenario a pre-existing JSON/YAML file which sets up both the front end and the database, with all credentials set up. That way, all the user has to do to get things ready is 'oc create -f application.json', where application.json has the resource objects for everything defined in it.

With the setup quickly out of the way, you can spend more time explaining the architecture of the application and what batch processing is all about, and then give the demonstration. By using a pre-existing resource definition file, there is less chance of people stumbling on the setup and never getting to the batch processing part.
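As a rough illustration of that suggestion, a stripped-down application.json covering the database half might look like the following. The image name, labels, and credentials are hypothetical, and the WildFly front-end objects (DeploymentConfig/BuildConfig/Service/Route) would be appended to the same items list:

```json
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
    {
      "apiVersion": "v1",
      "kind": "Service",
      "metadata": { "name": "postgresql" },
      "spec": {
        "ports": [ { "port": 5432 } ],
        "selector": { "app": "postgresql" }
      }
    },
    {
      "apiVersion": "v1",
      "kind": "DeploymentConfig",
      "metadata": { "name": "postgresql" },
      "spec": {
        "replicas": 1,
        "selector": { "app": "postgresql" },
        "template": {
          "metadata": { "labels": { "app": "postgresql" } },
          "spec": {
            "containers": [
              {
                "name": "postgresql",
                "image": "centos/postgresql-95-centos7",
                "env": [
                  { "name": "POSTGRESQL_USER", "value": "jberet" },
                  { "name": "POSTGRESQL_PASSWORD", "value": "jberet" },
                  { "name": "POSTGRESQL_DATABASE", "value": "jberet" }
                ]
              }
            ]
          }
        }
      }
    }
  ]
}
```

A single `oc create -f application.json` would then create every object at once, and the 'postgresql' Service name doubles as the DNS hostname the application uses.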

@chengfang
Contributor Author

Thanks @thesteve0 and @GrahamDumpleton , I'll rework some of the steps based on the comments so far, and get back to you.

@thesteve0
Contributor

+1 to Graham's suggestion - that would be the best way to do it.

@chengfang
Contributor Author

I've updated the Java Batch Processing scenario based on the review comments. Can you please review it again? Thanks!

Some of the main changes: switch to the oc command line (instead of the OpenShift Web Console) to simplify setting up the initial project, application, and database; and add more content to cover Java batch processing. See the commit logs for more details.

@GrahamDumpleton
Contributor

Looking good, although I don't understand the application being demonstrated, so that part doesn't have much meaning to me. All the commands and sequences appear to execute okay.

A few more comments:

1 - When referring to Source-to-Image with an acronym, I personally prefer to see S2I in upper case. This distinguishes it from s2i, the command-line program. IOW, S2I == process, s2i == command-line tool. The s2i tool, although used inside of OpenShift, is something that users don't usually need to know about.

2 - In step 2, rather than using oc status and having them keep running it until the deployment is ready, you could have them use oc rollout status dc/intro-jberet. This returns only when the deployment is complete. But then you already use oc rollout status with postgresql, as I found when I went to the next step.

Example output is:

$ oc rollout status dc/intro-jberet
Deployment config "intro-jberet" waiting on image update
Waiting for latest deployment config spec to be observed by the controller loop...
Waiting for rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for rollout to finish: 0 of 1 updated replicas are available...
Waiting for latest deployment config spec to be observed by the controller loop...
replication controller "intro-jberet-1" successfully rolled out

You could add a comment after that saying 'this is a great time to switch to the Dashboard, explore the web console interface to OpenShift, and watch the build and deployment as it progresses'.

3 - When you give the URL in part 2, if you have it as:

http://intro-jberet-jberet-lab.[[HOST_SUBDOMAIN]]-80-[[KATACODA_HOST]].environments.katacoda.com/

That is, with no backticks and no {{copy}} annotation, katacoda will automatically detect it as a URL and make it clickable. When clicked, it should open in a new tab/window automatically, so you can avoid the need to have them copy/paste the URL themselves.

4 - Where you show XML in step 4, just be mindful of browser width. On a large display it may look fine, albeit still with wrap-around, but also check on a smaller laptop screen. If it looks incomprehensible, the other option is to include the file in the assets subdirectory of the scenario and reference it in index.json so it is copied to the instance. You could then have the user execute cat blah.xml, and it will display in the right-side part of the screen, where the text is smaller and wider and so will not wrap.

5 - In step 5, you say to follow the href URL link given above. This was confusing to me, as it suggested I needed to cut and paste it from above into a new browser window, but that will not work: that URL is from an older instance of the katacoda environment, from when you ran the command to capture the sample output. It needs to come from the actual curl output. You do then give the curl command below. Maybe just reword a bit to better explain that the user should take the href value from the output of the curl command they just ran, and then show the clickable example.

6 - You might consider having the commands that generate JSON pipe their output through python -m json.tool. This will format the JSON nicely.

$ curl http://intro-jberet-jberet-lab.2886795281-80-ollie02.environments.katacoda.com/intro-jberet/api/jobexecutions/1 | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   356  100   356    0     0   3210      0 --:--:-- --:--:-- --:--:--  3236
{
    "batchStatus": "COMPLETED",
    "createTime": 1508133063625,
    "endTime": 1508133064551,
    "executionId": 1,
    "exitStatus": "COMPLETED",
    "href": "http://intro-jberet-jberet-lab.2886795281-80-ollie02.environments.katacoda.com/intro-jberet/api/jobexecutions/1",
    "jobInstanceId": 1,
    "jobName": "csv2db",
    "jobParameters": null,
    "lastUpdatedTime": 1508133064552,
    "startTime": 1508133063635
}

7 - In the summary step, instead of using the URL http://www.openshift.org/vm for Minishift, use https://www.openshift.org/minishift/
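The json.tool formatting from point 6 can be previewed without a cluster by piping any JSON string through it (python3 is used in this sketch; the transcript above used python):

```shell
# json.tool re-indents whatever JSON arrives on stdin; with curl, the full
# pipe would be: curl -s <url> | python -m json.tool
echo '{"batchStatus":"COMPLETED","executionId":1,"jobName":"csv2db"}' | python3 -m json.tool
```

The single-line input comes back indented with one key per line, which is what makes the job-execution responses above readable.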

@chengfang
Contributor Author

Thanks, @GrahamDumpleton. I've updated the scenario based on your comments. For the XML snippet formatting, I rearranged the content, and now it fits nicely in the left panel. If the user wants a wider viewing area, they can click the link in the 2nd paragraph to view it on GitHub with syntax highlighting.

When we use the curl command, shall we use curl -s to suppress the curl progress info? It seems like extra clutter in the console. Looking at the existing scenarios, there is only one instance of curl -s; the other uses just keep the curl default (with progress info).

@chengfang
Contributor Author

Any updates? Is it ready for a merge?

@GrahamDumpleton
Contributor

@mhausenblas You want to look at merging this when back from PTO? All the things I highlighted were addressed. Note that the branded-ui file changes will need to be added.

@chengfang
Contributor Author

Can someone help move this PR forward?

@mhausenblas
Contributor

Seems this one has fallen through the cracks, apologies. Can you please fix the conflicts (new layout, see here), and then I think we're good to go. Sorry again for the delay :(

… move java batch processing course declaration from introduction-pathway.json to middleware-pathway.json.
…ibute value to "Building Applications on OpenShift".
@chengfang
Contributor Author

I've synced up with the upstream and updated the java-batch-processing course to follow the new layout (moved the java-batch-processing course under the middleware pathway). I've also gone through the updated course in my katacoda account with no problems (https://www.katacoda.com/cfang/courses/middleware/java-batch-processing).

@mhausenblas
Contributor

Thank you @chengfang. When going through it, at step 3 I encounter the following error:

oc new-app postgresql-ephemeral --name postgresql --param POSTGRESQL_USER=jberet --param POSTGRESQL_PASSWORD=jberet
error: Errors occurred while determining argument types:

postgresql-ephemeral as a local directory pointing to a Git repository: stat postgresql-ephemeral: no such file or directory

Errors occurred during resource creation:
error: no match for "postgresql-ephemeral"

@chengfang
Contributor Author

@mhausenblas thanks for reviewing. Yesterday I did go through the course successfully.

My db setup is similar to the one used in the course "Connecting to a Database Using Port Forwarding". I just tried the port forwarding course in my katacoda account and hit the same error:

$ oc new-app postgresql-ephemeral --name database --param DATABASE_SERVICE_NAME=database --param POSTGRESQL_DATABASE=sampledb --param POSTGRESQL_USER=username --param POSTGRESQL_PASSWORD=password
error: Errors occurred while determining argument types:

Then I went to the official site https://learn.openshift.com/introduction/port-forwarding/ to start the port forwarding course, and saw the exact same error.

Could it be that the GitHub repo for the postgresql-ephemeral template has become unreachable from the katacoda side, maybe from exceeding rate limits?

@mhausenblas
Contributor

Could it be that the GitHub repo for the postgresql-ephemeral template has become unreachable from the katacoda side, maybe from exceeding rate limits?

Hmmm, no idea to be honest @chengfang, but thanks for confirming it!

@GrahamDumpleton @BenHall any ideas here?

@BenHall
Contributor

BenHall commented Jan 5, 2018

Potentially related to #108 ?

… loading from postgresql-ephemeral-template github repo (based on #108).
@chengfang
Contributor Author

I updated java-batch-processing/env-init.sh, copying from the example in #108, but I'm still seeing the same error. So it looks like something else is wrong. Any ideas?

Plus, the database port forwarding course already contains the fixed env-init.sh (the log shows it was fixed on 12/24/2017), but when run from my account, the port forwarding course still suffers the same error.

@GrahamDumpleton
Contributor

It is going to be because it was pulling master and not the branch. I found that this would be an issue with playgrounds, and it looks like I forgot about fixing the others. See the example of what to use at:

The playgrounds need to have the timing issue fixed as well.

@GrahamDumpleton
Contributor

Pulling the master template is wrong, but it seems it's not even getting to that point, as the openshift project is empty, i.e., no templates loaded. It looks like the looping check is succeeding while the project is still not ready to have templates loaded into it. So it may be the robustness of the check to see if the project exists and is ready.

@GrahamDumpleton
Contributor

Another timing issue, most likely. If env-init.sh runs slowly and the instructions are followed such that you log in as developer before env-init.sh is complete, the env-init.sh commands will then start failing, as they aren't running as admin.

$ for i in {1..200}; do oc get project/openshift && break || sleep 1; done
Error from server (Forbidden): User "developer" cannot get project "openshift"
Error from server (Forbidden): User "developer" cannot get project "openshift"
Error from server (Forbidden): User "developer" cannot get project "openshift"

This is a big problem with having env-init.sh run in parallel with the user working through the scenario. Any oc commands in env-init.sh that must run as admin may have to reference credentials in the OpenShift configuration to be guaranteed to work.
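A cluster-free sketch of a hardened wait loop: the wait_until helper and the marker file below are illustrative stand-ins; in a real env-init.sh, the guarded command would be the admin-credentialed check, e.g. oc get project/openshift --as system:admin.

```shell
# Retry a command until it succeeds or the attempt budget runs out.
wait_until() {
  attempts=$1; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Stand-in for the readiness check: a marker file that appears after 2 seconds.
rm -f /tmp/project-ready
( sleep 2; touch /tmp/project-ready ) &
wait_until 10 test -f /tmp/project-ready && echo "project ready"
```

Wrapping the check in a function with an explicit attempt budget also makes the failure mode visible: if the budget is exhausted, the script can bail out with an error instead of silently running setup commands against a project that never became ready.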

@GrahamDumpleton
Contributor

The solution may be to add this to all env-init.sh files, right up front:

ssh root@host01 'oc adm policy add-cluster-role-to-group sudoer system:authenticated'

The commands in env-init.sh that follow can then use --as system:admin, and will work even if a login as a specific user has already been done.

@GrahamDumpleton
Contributor

Have tried to address issue with port forwarding scenario in #117.

It moves the commands that take a long time to the end of the script, and uses user impersonation to ensure that commands can still run if the user logs in quickly and changes the active user.

I need to go back and review all the other scenarios to see which others need to change.

@chengfang
Contributor Author

Thanks @GrahamDumpleton for the solution. I just updated java-batch-processing/env-init.sh based on PR #117 mentioned above (copying port-forwarding/env-init.sh). After that, I was able to run the java-batch-processing scenario successfully, and the postgresql db works as expected.

@mhausenblas
Contributor

So I just wanted to test it and when I go to https://katacoda.com/cfang/courses/intro-openshift/java-batch-processing it's not available anymore. Any idea @BenHall?

@BenHall
Contributor

BenHall commented Jan 6, 2018

Looks like it's been moved to the new structure - https://katacoda.com/cfang/courses/middleware/java-batch-processing

@mhausenblas
Contributor

OK, I just tested it successfully, and there's only one thing to consider before we merge: in Step 2, when you say "The build of our application is now scheduled, and it will take some time before it's ready", you might want to give an indication of how long (people are typically not that patient and might expect seconds when it is really more like a minute or two). This is mainly to avoid users thinking something is broken or the deployment is stuck. So maybe change it to "up to 3 min" or the like, in bold?

Thanks a lot for your patience, great work!

…o 3 minutes) before the application build is ready.
@chengfang
Contributor Author

@mhausenblas , good point. I've just made the suggested change. In my test, the build usually takes about 1.5 minutes, so "up to 3 minutes" seems a good estimate.

@mhausenblas
Contributor

Thanks a lot @chengfang! Yeah, I guess 3 min is more on the pessimistic side, but if the system is very busy, with many users and a slow connection, it might add up to that ;)

@mhausenblas mhausenblas merged commit 24377f3 into openshift-labs:master Jan 8, 2018
BenHall pushed a commit that referenced this pull request Jan 30, 2018
…ons with WildFly on OpenShift (#62)

* initial impl of intro-katacoda/intro-openshift/java-batch-processing

* initial impl of intro-katacoda/intro-openshift/java-batch-processing

* added step to env-init.sh to pull postgresql image; updated finish.md and other chapters.

* minor updates to distinguish the web terminal and client terminal.

* quick changes

Looks good, some minor changes, along with the need for 1 more paragraph just to explain the purpose of the Spec. Go ahead and quote a nice paragraph from the spec if you want

* pls change image and some minor changes

and also making the URL nicer

* minor changes

nothing but some edits and corrections on wording

* Update 03-deploying-a-postgresql-database.md

I didn't realize they didn't use empty dir and actually used disk space in the running container

* Update 04-access-postgresql-database.md

you could make this a lot easier by copying off the terminal. I am also concerned that we are hard coding IPs but I will see what we do next

* initial impl of intro-katacoda/intro-openshift/java-batch-processing

* added step to env-init.sh to pull postgresql image; updated finish.md and other chapters.

* minor updates to distinguish the web terminal and client terminal.

* Rework the tutorial based on oc cli commands instead of OpenShift web console, with more focus on batch processing than project setup.

* fix application urls to the form: [[HOST_SUBDOMAIN]]-80-[[KATACODA_HOST]]

* add more output content after running commands; add output from running psql sql statement, and some other minor improvement.

* updated various steps based on PR comments from Graham D (Oct 16, 2017).

* improve the formatting of job xml snippets.

* change to use the official Minishift URL; add -s option to curl command to suppress curl progress info.

* move java-batch-processing course from intro-openshift directory to middleware directory.

* change the middleware/java-batch-processing/index.json pathTitle attribute value to "Building Applications on OpenShift".

* modify java-batch-processing/env-init.sh to fix the timing issue when loading from postgresql-ephemeral-template github repo (based on #108).

* modify java-batch-processing/env-init.sh to fix the ordering issue (based on #117)

* modify 02-deploying-a-new-application.md, adding estimated time (up to 3 minutes) before the application build is ready.