Compatibility of pack to create api driven projects #16

Closed
ricalanis opened this issue Apr 28, 2016 · 8 comments

Comments

@ricalanis

Hi there! Loved the project; it really reflects the maturity of data science projects and where we stand. So good!

I raise this issue because I was wondering whether the current structure can be adapted to an API-driven project, that is, a project in which the analysis and data flow are tied to an API definition.

If yes, what would that look like? Then we can document it (or you can point me to where it is already documented).
If not, why? Some books recommend having an API flow for analysis and processing, so that our results and analysis are available to our colleagues in engineering. It could even make scaling up easier.

Thank you so much!

@ghost

ghost commented Apr 28, 2016

@ricalanis can you give a better description of what the flow of an API-driven project is?
Or link to some resources about it? Just to keep the discussion clean.

@ricalanis
Author

ricalanis commented Apr 28, 2016

Of course @MrOutis

I would say that a data science project is API-driven when an API is defined throughout the data science process to:

  1. Make raw and intermediate data available at an endpoint
  2. Direct an output to an API so that it is interactive (from a dataviz to a simple data product)

Maybe I am biased, but with Flask it is relatively easy to make this leap, and I would like to have a clearer view of how this can be included in the flow that this project proposes.

A source that covers this:
http://www.datacommunitydc.org/blog/2014/02/flask-mega-meta-tutorial-data-scientists

@isms
Collaborator

isms commented Apr 28, 2016

So far, when setting up an API (or web app, or streaming data pipeline, microservice, etc) under this structure, we have been putting it in its own folder within src/. Let's say I was going to write a small API with Flask/Flask-RESTful. I'd create a directory for it and put my source code files in there:

/
└── src
    └── api
        ├── app.py
        └── __init__.py

Then, to run it, you could call python src/api/app.py from the project root. You could even create a rule in the Makefile to run this, or create a Dockerfile to run it in a container: whatever entry point makes the most sense for your project.

Does that cover the use case you have in mind @ricalanis or do you think more changes are necessary to the existing structure?

@ricalanis
Author

ricalanis commented Apr 28, 2016

Seems legit! Really covers my use case, thanks!

I wonder whether such a /src/api folder should be part of the default folder structure, since scripts that make your models or data discoverable can be, for me, as useful as the viz-generating code. Of course, one runs the risk of growing the folder structure bigger and bigger to cover such use cases. What do you guys, @MrOutis @isms, think?

@isms
Collaborator

isms commented Apr 29, 2016

Interesting! This is not a common use case for us but we're very open to hearing more from the community.

Instead of closing this issue, I'm going to create a needs-discussion label so we can see if others are interested in providing feedback. We'll revisit the issue after some time has passed to figure out whether to close it or take action.

Sound good?

@ricalanis
Author

Sounds perfect! That way I will be able to test it more and form a more solid opinion on this and on how it fits into the flow that the project proposes.

@SanchitAggarwal

Hi @MrOutis and @isms,

I just came across the directory structure and this issue. I think @ricalanis has a point. Often you want to move your model to a production environment, and there you need an endpoint that can communicate with third-party applications. Adding a /src/api folder to the default structure would make sense for my use case too.
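
As a rough sketch of the kind of endpoint described here, the following Flask app accepts JSON features and returns a prediction. The /predict route, the request shape and the predict placeholder (which simply averages its inputs) are assumptions made for illustration; a real project would load its trained model instead.

```python
"""Hypothetical prediction endpoint that a third-party application could call."""
from flask import Flask, jsonify, request


def predict(features):
    """Placeholder standing in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


def create_app(model=predict):
    """Build the Flask app around any callable that maps a list of numbers to a value."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict_endpoint():
        # silent=True yields None instead of raising when the body is not valid JSON.
        payload = request.get_json(silent=True)
        features = payload.get("features") if isinstance(payload, dict) else None
        valid = (
            isinstance(features, list)
            and len(features) > 0
            and all(isinstance(x, (int, float)) for x in features)
        )
        if not valid:
            return jsonify(error="expected a non-empty 'features' list of numbers"), 400
        return jsonify(prediction=model(features))

    return app
```

A client would POST a body such as {"features": [1, 2, 3]} to /predict and read the prediction field from the JSON response.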

@isms
Collaborator

isms commented Feb 13, 2018

Closing but still open to hearing more about this if people have specific improvement proposals.

isms closed this as completed Apr 14, 2018