METRON-671: Refactor existing Ansible deployment to use Ambari MPack #436
beec0a6 to 8b3d092
@@ -1,35 +0,0 @@ | |||
Kibana 4 |
Is this documented anywhere else?
Is what documented? I just removed the role as it's deprecated with the MPack.
The file I commented on was the README that documented how to add or modify the Kibana page template.
Ah, it is, but not well. I'll make sure I get it as part of this effort.
@@ -15,4 +15,4 @@ | |||
# limitations under the License. | |||
# | |||
--- |
Is there a reason not to go to the latest ( _121 ) if we are going to update?
I used Ambari's default so I don't have to worry about any newly introduced incompatibilities.
How are you running the playbook? I cannot get it to execute |
I am, for both full dev and ec2. What are you trying to do? |
I want to run the playbook to just build the RPMs, not deploy. So just metron_build. |
Yes, that works on my rig. Can you tell me a bit more about what you're experiencing? |
Can you share your command line? Do you run it from /playbooks? Do you pass an inventory with -i? Did you copy or create an ansible.cfg? |
Sure. It's vagrant up or ./run.sh. :) I run it as part of running full dev or ec2. What error are you getting? Btw, to only build the RPMs, you should still be able to run mvn package -DskipTests -Pbuild-rpms from maven-deploy. |
I need to run the playbook though. |
ansible-playbook -v playbooks/metron_build.yml
< PLAY >
skipping: no hosts matched
< PLAY RECAP > |
Do I have to create an inventory with my local machine name? |
Yeah, that seems like a good thing to try. |
ansible-playbook -v -i "localhost," -c local playbooks/metron_build.yml
Now it is running, and I'll see about the errors. |
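For anyone reading along, a minimal sketch of why that invocation matches a host where the earlier one reported "no hosts matched". The helper name local_playbook_cmd is hypothetical and not part of this PR; it only prints the command line rather than running it, and it is not a supported build path.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): print the command line for running
# one playbook against the local machine without an inventory file.
local_playbook_cmd() {
    playbook="$1"
    # The trailing comma makes Ansible treat "localhost," as an inline host
    # list instead of a path to an inventory file.
    # "-c local" runs the tasks directly on this machine instead of over SSH.
    printf 'ansible-playbook -v -i "localhost," -c local %s\n' "$playbook"
}

# Print the command for the build playbook discussed above.
local_playbook_cmd playbooks/metron_build.yml
```

Printing the command instead of executing it keeps the sketch safe to paste; copy the printed line into a shell from the deployment directory to actually run it.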
Maybe we should add that line to the doc or create a script? |
No. It's not meant for that. If you want to build standalone, Maven is the supported way. |
OK. When you review what I'm doing in the playbook/role, we can talk about alternatives. |
Sounds good. The important thing to remember is the overall goal of reducing reliance on Ansible. I had to add a build task because installation will fail without it, but I fully expect that task to disappear sooner rather than later, so adding additional dependencies on it will need to be scrutinized carefully. |
Well, if we don't want to put deployment things with the src, then Ansible is a more flexible and easier-to-use tool for certain tasks too. But we will talk about it if I get it working. |
@dlyle65535, after a fetch and merge of your latest, I can no longer vagrant up full_dev:
==> node1: Adding box 'new_base' (v0) for provider: virtualbox
Couldn't open file /Users/dml/projects/metron-dlyle/metron-deployment/packer-build/builds/base-centos-6.7-2.1.20170303223924.git.33abe8cf13c347a2dfdece145a7b8c17f2a423c0_dirty.virtualbox.box |
I have been able to launch "Quick Dev" with deployment report. Thanks for the fix, @dlyle65535. I have been fighting a bit with the AWS deployment and ran into two issues. (1) On one pass, the Ambari setup seemed to fail, but the deployment continued and then failed later on. To fix it, I manually logged into the host, ran the Ambari setup, and then re-ran the deployment, which addressed the problem. I am almost certain that I saw this before the work in this PR.
(2) The second issue was more unexpected. On all but one of the 10 AWS nodes, the deployment went smoothly. At some point during the deployment, Ansible could not talk to one node, but it continued on anyway. After the other 9 were finished, Ambari showed all 10 nodes, with the failed one in yellow, indicating that it could not get a heartbeat. After Ansible was done with the 9 nodes, it seemed to almost start over on the last node: it rebuilt the source code, pushed out the RPMs, reinstalled the MPack, etc. That really confused the cluster, and it has not processed any data. I'm sure a little manual effort could fix up the cluster, but the behavior of Ansible was weird. Previously, when I worked with the AWS deployment, it would fail if any one node failed. Now it seems to retry failed nodes at a later point, which has negative implications when we expect actions like the build, MPack install, etc. to occur only once. Not sure what to make of this issue. |
Travis failure. Upsetting. This came in with my latest master merge. It doesn't fail locally with the Travis command line. Log below. It's in unrelated code. Any ideas?
@nickwallen - I had run EC2 testing a bunch and it worked at least as well after these changes as it did before (sometimes AWS zigs when it should zag). But I have made quite a few changes since my last EC2 run. I'll spin it up and see if I get it too. |
@nickwallen - For your first issue, I think we're hitting a transient issue with Ambari where ambari-setup -s completes successfully but Ambari won't actually start. I'll see if I can get a better diagnosis. In other news: good news, bad news. Good news: I am able to replicate the integration test failure by running the tests in my local environment. Bad news: it's not in the code I touched and I'm completely flummoxed. Help would be much appreciated. Once I can get past these EC2 issues, I can diff master, but like I said. Help? Appreciated. |
Good news, I think I found the issue with the failing tests. The dependencies that Maven reported as "duplicated" weren't actually duplicates. I've replaced them. Travis will tell. @nickwallen - I did see the error you're talking about in your first point above. I think your memory is correct: it's one of those transients that we see sometimes. There's not much that can be done, but I am testing a patch that looks for "FATAL" in the ambari-setup stdout so at least we'll fail where the problem occurs. |
Don't run quick_dev role on ec2 buildout
@nickwallen - I pushed up a changeset that will address both your points. For 1, I added a test to ambari-setup to fail if FATAL appears in stdout. Making sure the EC2 build doesn't run the quick_dev role addresses the second. |
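A minimal sketch of the kind of stdout check described above, written as a plain shell function for illustration. The function name setup_output_ok and the sample strings are hypothetical; the actual change in this PR lives in the Ansible role and may look different.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the captured setup output does not
# contain the string "FATAL".
setup_output_ok() {
    # grep -q exits 0 when the pattern is found, so a match means failure here.
    if printf '%s\n' "$1" | grep -q 'FATAL'; then
        return 1
    fi
    return 0
}

# Stand-in output strings for illustration (not real Ambari logs).
if setup_output_ok "setup completed"; then
    echo "setup looks ok"
fi
if ! setup_output_ok "FATAL: setup did not complete"; then
    echo "setup failed: stop here"
fi
```

Failing at this step, rather than letting the run continue, puts the error next to its cause instead of surfacing it later in the deployment.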
@justinleet - that last commit adds the quotation requirement to the tooltip. |
@dlyle65535 METRON-745 is in (as I'm sure you can tell from the conflict list). I already incorporated the Kibana map changes, so you should just be able to accept master's version. |
@justinleet - Accepted. Thanks. |
@dlyle65535 fyi I'm deploying to ec2 right now. I'll update shortly. |
@dlyle65535 is the test for EC2 to just verify everything spins up as normal? Any additional specific items to test or smoke test? |
@mmiklavc - Exactly. We wanted to confirm that the issues @nickwallen was having were environmental. |
Thanks @mmiklavc! |
That's great, @mmiklavc. I am also a +1. I was able to test this successfully on Quick Dev and Full Dev. |
Thanks for all the help! I intend to merge this in tomorrow afternoon. @ottobackwards, @justinleet, I wanted to make sure you two were good. I also want to make sure there's no other feedback I've missed. |
I am +1. I have been working downstream and building off of this for a bit, and have been able to get quick and full up with everything started. I have some things that I would like to improve on, but like @dlyle65535 says, this is step 2 of many. |
I'm +1. I was just waiting for the EC2 component, but was able to get quick-dev, etc. spun up without issue. |
…dlyle via justinleet) closes apache/metron#436
Update - I think this is ready for review.
Update - Documentation pending, this is pretty much ready to go and could use some eyes on. The major change is that you'll need Docker (Docker for Mac on Macs) running for the build to complete. This is because the MPack requires the RPMs be built.
To test, run Quick Dev or Full Dev or both.
I'll be working on the documentation in the next day or so.
This is the first set of changes to enable Ansible installation to use the Ambari MPack. It currently works (in my environment) with full-dev using sensor-stubs.
It will not work with Quick-Dev or EC2 at this time. Update: All environments worked in my testing. It's starting to get large (well past that point, sorry), so I wanted to push it out for feedback while I'm working on issues with the distributed install.
Some points of interest:
Immediate next steps:
I also hope to receive and respond to feedback.