SAMOA-59: add an adapter for Apache Gearpump #54

Open
wants to merge 1 commit into
base: master

gy910210

Hi,

This PR addresses SAMOA-59. To run SAMOA on Apache Gearpump, follow the instructions in the README file.

Thanks.

@gy910210 changed the title from "SAMOA-59, add an adapter for Apache Gearpump" to "SAMOA-59: add an adapter for Apache Gearpump" on Apr 14, 2016
pom.xml Outdated
@@ -127,6 +137,9 @@
<miniball.version>1.0.3</miniball.version>
<samza.version>0.7.0</samza.version>
<flink.version>0.10.1</flink.version>
<gearpump.version>0.7.5</gearpump.version>
Contributor

Could we use the latest 0.8.0 version?

Contributor

Yes, I'll bump it to 0.8.0.

@gdfm
Contributor

gdfm commented Jun 6, 2016

Thanks for your work @pangolulu @manuzhang.
The code looks quite neat.
However, I didn't manage to run SAMOA on Gearpump by following the tutorial.
The topology starts, but it doesn't produce any output.
I tried with the latest version (0.8.0). Should I have used an older version?

@manuzhang
Contributor

@gdfm I upgraded the Gearpump version to 0.8.1-SNAPSHOT, which includes critical bug fixes and package renaming, although it has to be built manually. Hopefully we'll release Gearpump 0.8.1 soon.

@nicolas-kourtellis

@manuzhang is there any update from your end on which version of Gearpump we can use to test SAMOA?

@manuzhang
Contributor

@nicolas-kourtellis I've updated to the latest release, 0.8.1. Note that this is a source release, so you have to build Gearpump from source.

@manuzhang
Contributor

@nicolas-kourtellis Any updates on this PR?

@nicolas-kourtellis

Hi manuzhang,
I was trying to test the adapter in local mode, but I am having difficulties compiling Gearpump 0.8.1 from source. I installed Scala, sbt, etc., following the instructions I found here:
http://gearpump.incubator.apache.org/releases/latest/get-gearpump-distribution.html
However, the build keeps reporting errors and fails. Could you make a binary version of 0.8.1 available so that I can use it directly? Or are you planning to release one publicly soon?

Thanks!

Nicolas

@manuzhang
Contributor

@nicolas-kourtellis We haven't made an official binary release yet, so the binary for 0.8.1 is temporarily hosted at our old GitHub repo: https://github.com/gearpump/gearpump/releases/tag/0.8.1

@nicolas-kourtellis

Hi @manuzhang,

I managed to get the adapter working. Here are some notes that I would ask you to take into consideration:

  • Compiling Gearpump from source is difficult. It would be good to have a compiled version to use directly.
  • Given a compiled version (which was my case, because @manuzhang provided one), I managed to compile and package SAMOA with Gearpump and run the package.
  • However, it would be good to upgrade the adapter to the new version of SAMOA in incubation, which is 0.5.0. This should be fairly straightforward, and it will allow us to test the adapter with the additional generators and ML methods added recently.
  • Feedback when executing VHT:
    => The engine seems to keep executing the topology long after it has been created, used for the task, and finished. Is there any way to send a signal at the end of the execution to shut down the topology (not the engine itself)? It was occupying resources on my computer for no reason, at full CPU consumption. I found a manual way to kill it using the command "gear kill -appid X", where X is the id of the task, but I wonder if there is a more automatic way.
    => After I killed the jobs manually, the Java processes that were created for the execution (I assume they are the containers of the topologies) were still alive, although they were not consuming many resources. Shouldn't they have been terminated and removed? Is there a way to do that?
    => When I run new tasks, they keep getting added to the engine (which is logical), even though I had killed the earlier ones.
    => Multiple executions of the same experiment with the same seed for the random generator (parameter -r), which should yield the same random tree, produce different accuracy. Is that expected?
    => Using different seeds for the random tree generator (r=1,...,5), the accuracy of VHT on local Gearpump is fairly low (average over 5 different seeds: 65.39%) compared to running the topology on local Storm (84.046%). Is there an explanation for such a large drop in accuracy?
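
Until there is an automatic shutdown, the manual cleanup mentioned above could be scripted. The following is only a minimal sketch: it assumes the `gear` launcher is on the PATH and that the application ids are already known (how to list them is not covered here). The helper names `build_kill_command` and `kill_apps` are hypothetical; the only Gearpump command used is the `gear kill -appid X` invocation quoted above.

```python
import subprocess
from typing import Callable, Iterable, List


def build_kill_command(app_id: int, gear_bin: str = "gear") -> List[str]:
    """Return the argv for `gear kill -appid <app_id>`.

    `gear_bin` is an assumption: it defaults to `gear` being on the PATH.
    """
    return [gear_bin, "kill", "-appid", str(app_id)]


def kill_apps(app_ids: Iterable[int],
              runner: Callable[..., object] = subprocess.run) -> List[List[str]]:
    """Run the kill command once per application id.

    `runner` is injectable so the helper can be dry-run without Gearpump
    installed. Returns the list of argvs that were passed to the runner.
    """
    commands = []
    for app_id in app_ids:
        cmd = build_kill_command(app_id)
        # check=True makes subprocess.run raise if the command exits non-zero.
        runner(cmd, check=True)
        commands.append(cmd)
    return commands
```

For example, `kill_apps([1, 2])` would issue `gear kill -appid 1` and then `gear kill -appid 2`. This only stops the applications; it does not address the leftover Java processes described above.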

@manuzhang
Contributor

@nicolas-kourtellis Thanks for the detailed review. I'll look into each item and get back to you.

@abifet
Contributor

abifet commented Aug 14, 2019

@manuzhang We are planning a new release. Any news on this?

@manuzhang
Contributor

@abifet It's good to see this project active again. Since it's been quite a while, it will take me some time to redo the PR, so it will probably miss the release.

@abifet
Contributor

abifet commented Aug 14, 2019

Thanks @manuzhang! No problem!

5 participants