Evaluate framework or routing strategy for the API endpoints #3802

Closed
ph opened this issue Aug 26, 2015 · 21 comments

Comments

@ph
Contributor

ph commented Aug 26, 2015

In the past (Logstash 1.4.2) we used Sinatra to serve the Kibana JavaScript code. Since the API will need to answer multiple web calls, we should see if we can leverage an existing library to speed up the development process.

(non-exhaustive list of libraries)

All of the libraries above support basic CRUD operations, but I think we should investigate whether we will need to support WebSocket or long-polling streams in the near future. Both of these technologies could be great for providing a step debugger or displaying live stats on a Marvel-like dashboard.

The minimal requirements for them are:

  • Routing handling
  • Params handling and/or sanitization
  • Request/response content type negotiation
  • Rack based (which we will need to hook into puma)
  • Easy to test.

Ref: #2611

@suyograo
Contributor

I thought we agreed on using Puma for serving HTTP requests. @jsvd has already used it in the context of logstash-input-http, with monitoring and manageability APIs in mind.

@ph
Contributor Author

ph commented Aug 27, 2015

Puma is the web server, and I don't think we will move away from it. The libraries I am talking about are lightweight MVC frameworks.

@purbon
Contributor

purbon commented Aug 27, 2015

Not sure how comfortable you are with it, but I really like Padrino -> http://www.padrinorb.com/ ... for me it has been something like Sinatra on asteroids 😸

/cheers

@ph
Contributor Author

ph commented Aug 27, 2015

I know about Padrino and I have also used it in past projects.

And to be totally honest, I had forgotten it existed, probably because I didn't find it brought that much to the table.

I think for the API we need something lightweight, easy and customizable.
I believe that beyond routing, params handling and Rack support we don't need that much?

@ph
Contributor Author

ph commented Aug 27, 2015

I have updated the issue with a few requirements I had in mind.

@suyograo just to clarify, Puma is a web server that supports Rack, which is a library/specification for writing web frameworks in Ruby (similar to WSGI in the Python world or Ring in Clojure).

In the HTTP input we are using Rack directly to talk to Puma (the communication format is an array like this: [response_code, headers, body]). In the context of the input it makes sense to use Rack directly to serve responses, because we only answer one route.

In the context of the API, it is more manageable to use a lightweight library to hide that complexity instead of implementing custom routing.

Instead of switch cases or regexps, we only have to write this:

# Sinatra
get "/pipeline/:name/" do |name|
  Pipeline.fetch(name)
end
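
For contrast, here is a rough sketch of what the hand-rolled alternative might look like on bare Rack. It is only an illustration, not code from Logstash: the `Pipeline` class, its `CONFIGS` data, and the `RawRackApi` name are hypothetical stand-ins. It relies only on the Rack contract (an object responding to `call(env)` that returns `[status, headers, body]`), so it runs without any gem installed.

```ruby
# Hypothetical stand-in for the Pipeline lookup used in the Sinatra example.
class Pipeline
  CONFIGS = { "main" => "input { stdin {} }" }.freeze

  def self.fetch(name)
    CONFIGS.fetch(name) # raises KeyError for unknown names
  end
end

# A Rack app is any object responding to #call(env) that returns
# [status, headers, body], where body responds to #each.
class RawRackApi
  PIPELINE_ROUTE = %r{\A/pipeline/(?<name>[^/]+)/\z}

  def call(env)
    case env["REQUEST_METHOD"]
    when "GET"
      # Manual routing: every new endpoint needs another regexp and branch.
      match = PIPELINE_ROUTE.match(env["PATH_INFO"])
      return respond(404, "not found") unless match

      begin
        respond(200, Pipeline.fetch(match[:name]))
      rescue KeyError
        respond(404, "unknown pipeline")
      end
    else
      respond(405, "method not allowed")
    end
  end

  private

  def respond(status, text)
    [status, { "content-type" => "text/plain" }, [text]]
  end
end

# Exercise the app directly with a minimal env hash; no server is needed.
status, _headers, body = RawRackApi.new.call(
  { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/pipeline/main/" }
)
puts status, body.first
```

Even with a single route, the method check, path matching and error handling are all explicit, which is the complexity a small framework would hide.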

@ph
Contributor Author

ph commented Sep 8, 2015

I did a bit of experimentation and this is what I have come up with:

  • Rack is too close to the metal for our needs.
  • Cuba is on the weird side syntax-wise and not OOP friendly.
  • Lotus doesn't work without Ruby 2.0 syntax (sadly, it has the nicest code of the others).

So I'll stay with Sinatra.

ph added the monitoring label Sep 9, 2015
ph changed the title from "Metrics: Evaluate framework or routing strategy for the API endpoints" to "Evaluate framework or routing strategy for the API endpoints" Sep 11, 2015
ph added v2.1.0 and removed v2.0.0 2.0 labels Sep 11, 2015
@guyboertje
Contributor

What to use as the intermediate datastore?
metrics_pipeline push -> ES -> pull sinatra
metrics_pipeline push -> embedded java k/v store -> pull sinatra
metrics_pipeline push -> memcache -> pull sinatra
metrics_pipeline push -> redis -> pull sinatra
metrics_pipeline -> pull sinatra

@ph your code example seems to indicate that the pipeline acts not only as the metric events receiver but also as the in-memory key/value store and the aggregation processor.

@guyboertje
Contributor

@ph - I know your code above is an example. I just wanted to expand on the role(s) that this component will perform.

@colinsurprenant
Contributor

I have successfully used Grape in the past and thought it was a good lightweight+flexible balance specifically for APIs.

http://intridea.github.io/grape/
https://github.com/ruby-grape/grape

@ph
Contributor Author

ph commented Sep 15, 2015

@guyboertje Yeah, sorry, I should have expanded on that. I should say it depends on the metrics.
Most of the metrics will be consumed not through a call to Logstash but through a Kibana dashboard, so it makes sense to PUSH to ES.

Maybe there is value in exposing some small metrics directly in the Logstash API. If that's the case, I think it would be a push to an in-memory store at first and a pull from Sinatra, but we will keep a relatively small window (last X minutes).
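
As a rough sketch of that push/pull idea with a small retention window: the `WindowedMetricStore` class, its method names, and the injectable `clock` are all hypothetical, chosen only to illustrate the shape. A collector would call `push`, and an API route would call `pull`.

```ruby
# Hypothetical in-memory store that only keeps samples from the last
# `window_seconds` seconds. Not part of Logstash.
class WindowedMetricStore
  Sample = Struct.new(:name, :value, :at)

  # `clock` is injectable so the window can be tested without sleeping.
  def initialize(window_seconds, clock: -> { Time.now.to_f })
    @window = window_seconds
    @clock = clock
    @samples = []
    @lock = Mutex.new
  end

  # Push side: called by whatever collects the metrics.
  def push(name, value)
    @lock.synchronize do
      now = @clock.call
      @samples << Sample.new(name, value, now)
      evict(now)
    end
  end

  # Pull side: returns the values still inside the window for one metric.
  def pull(name)
    @lock.synchronize do
      evict(@clock.call)
      @samples.select { |sample| sample.name == name }.map(&:value)
    end
  end

  private

  # Samples are appended in time order, so expired ones are at the front.
  def evict(now)
    cutoff = now - @window
    @samples.shift while @samples.first && @samples.first.at < cutoff
  end
end

# Usage with a fake clock: keep a 300 second window.
now = 0.0
store = WindowedMetricStore.new(300, clock: -> { now })
store.push("events.in", 10)
now = 200.0
store.push("events.in", 12)
now = 400.0
p store.pull("events.in") # the sample pushed at t=0 has expired
```

A plain array guarded by a mutex is enough for a sketch; anything with many metrics or a high event rate would likely want per-metric buckets or pre-aggregated counters instead.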

Also, I believe some calls will need to be synchronous, like fetching the configuration for a specific pipeline/plugin or getting the current hot_threads (#3909), since they don't fit well in a push/pull model.

@colinsurprenant I had great success with Grape but always forget about it! Added to the list.

@colinsurprenant
Contributor

I am not sure I understand the idea of using something other than what we already have in place to support metrics push, like having a logstash-input-metrics and using logstash-output-elasticsearch?

@guyboertje
Contributor

@colinsurprenant - For the setup you describe, where is the API getting its data from, ES?

So if ES is available, then kv store == ES, else kv store == ?

Could we chop up the config and push the parts to the kv store initially or at config reload?
Will we be able to read JMX info at the time of receiving a metric (though perhaps not for every metric) and augment the metric with thread and memory information? In other words, how important is it to know the thread/memory information at an arbitrary time, or to have it related to a particular metric?

@colinsurprenant
Contributor

I am obviously missing something here. The way I see it, metrics are volatile in Logstash, collected from the time Logstash starts, and queryable through an API. For persistence they should just be dumped into external storage (ES) using Logstash's own architecture (pipeline + plugins), possibly supporting multiple pipelines if we don't want to interfere with the measured pipeline?

@ph
Contributor Author

ph commented Sep 15, 2015

@guyboertje @colinsurprenant let's focus on the shutdown-related stuff since it's our priority, and set up a Zoom session afterwards. I am under the impression that we are all saying the same thing but missing context in the comments.

@colinsurprenant
Contributor

+1 on a Zoom sometime to sync on the vision
-1 on preventing discussion of design issues not directly related to whatever the day's priority is.

suyograo added v5.0.0 and removed v2.1.0 labels Oct 20, 2015
ph assigned purbon Dec 7, 2015
@purbon
Contributor

purbon commented Dec 9, 2015

Hi,
in my experience, and I agree with @ph here, Sinatra is simple and versatile enough to be a good framework for supporting an API (as it always has been for me). This does not mean the others are bad, but if done in an organized way I think Sinatra is a fair choice.

@purbon
Contributor

purbon commented Dec 9, 2015

I see the discussion here has jumped into design a lot. Is there an issue where we can focus that discussion, and maybe keep this issue only about frameworks?

@ph
Contributor Author

ph commented Dec 9, 2015

@purbon +1, let's keep this issue about frameworks and we can create a new issue about the design/architecture.

@ph
Contributor Author

ph commented Jan 12, 2016

@purbon can you update this issue with what was done in the branch?

@purbon
Contributor

purbon commented Jan 12, 2016

@ph sure

@purbon
Contributor

purbon commented Jan 12, 2016

We finally went with Sinatra as the simplest and most versatile framework, and also the one I had the most experience with. Sinatra meets all the necessary requirements and should not be an impediment from an API design point of view.


5 participants