Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Book POST functionality #62

Merged
merged 48 commits into from
Feb 26, 2018
Merged

Book POST functionality #62

merged 48 commits into from
Feb 26, 2018

Conversation

bdr99
Copy link
Contributor

@bdr99 bdr99 commented Jan 31, 2018

This PR implements the functionality for POSTing new ebooks to the database.

To add books, send an HTTP POST request to /books/post/[secret]. The secret is read from a file called book-post-secret in the project root. The body of the POST request should be the YAML data for the book. If the book was successfully added to the database, the server will respond with a 200 status code.

To add books, send an HTTP POST request to /books/post/. The request should contain an HTTP header X-Gitenberg-Secret that contains the secret key. On the server side, this secret is read from an environment variable (GITENBERG_SECRET). The body of the POST request should be the YAML data for the book. If the book was successfully added to the database, the server will respond with a 200 status code.

Copy link
Contributor

@eshellman eshellman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also may want to run pylint

from gitensite.apps.bookinfo.views import all_repos_txt
from gitensite.apps.bookinfo.views import metadata

#The secret must be stored in a file called "book-post-secret" located at the project root
secretfile = open("book-post-secret", "rw")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preferred way to pass secrets to the server on elastic beanstalk is via environment variables. look at how secrets are passed into the settings files. You can't just go in and write files!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, we will adjust the code to read the secret from an environment variable rather than from a file.

@@ -25,6 +30,7 @@
url(r'^all_repos.txt$', all_repos_txt, name='all_repos.txt'),
url(r'^$', HomePageView.as_view(), name='home'),
url(r'^books/(?P<book_id>\d+)\.(?P<ext>json|yaml)$', metadata, name='metadata'),
url(r'^books/post/' + secret, BookPostView.as_view(), name='book-post')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass the secret into to view and let the view validate it. that way you can can get fancier later, or use the auth classes to handle multiple users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we will try to restructure the code to work this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to changing the secret to be read from an environment variable, we also changed the secret to be passed to the view through an HTTP header.

@@ -38,3 +43,15 @@ def get_queryset(self):
return super(AjaxListView,self).get_queryset().filter(title__icontains=self.request.GET['q'])
else:
return super(AjaxListView,self).get_queryset()

class BookPostView(TemplateView):
@method_decorator(csrf_exempt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with method_decorator. usually I decorate a view directly in urls.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had to use the method decorator in order to disable Django's CSRF protection feature which was preventing us from implementing this POST endpoint. It is my understanding that the CSRF feature is meant to be used with HTML forms, so I don't think we need it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct; what I meant was that you can use csrf_exempt as a decorator directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide an example of what decorating a view in urls.py would look like?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in https://github.com/EbookFoundation/regluit/blob/master/frontend/urls.py#L108
csrf_exempt is in urls.py because the view is defined by an inherited method of a view class

in https://github.com/EbookFoundation/regluit/blob/master/api/views.py#L105
csrf_exempt decorates the view directly where it is defined

PyYAML.add_multi_constructor('!lcsh', default_ctor)

def addBookFromYaml(yaml):
obj = PyYAML.load(yaml)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not safe_load() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, was not aware of safe_load. We will change this.

def addBookFromYaml(yaml):
obj = PyYAML.load(yaml)

(book,created) = Book.objects.get_or_create(book_id=int(obj['identifiers']['gutenberg']), repo_name=obj['_repo'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

book_id is declared unique, so this will throw exceptions if you have two repos with same book_id and different repo_name. Unfortunately these exist; also a problem if repo_names change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for the warning. I guess we will have to find a different approach for the id. Maybe we will just use a numeric ID that increments each time a book is added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need a different approach, I think all you need to do is drop the unique declaration.

@eshellman
Copy link
Contributor

eshellman commented Feb 5, 2018

I recommend reading this: https://simpleisbetterthancomplex.com/tutorial/2016/10/31/how-to-handle-github-webhooks-using-django.html

No shame in copy and paste.

@eshellman
Copy link
Contributor

One thing I wish I'd done more is check my code style against a linter, like pylint. Now that I run pylint regularly, I find that it catches many of my coding errors and bad habits (such as having way too much code in a module.)

@bdr99
Copy link
Contributor Author

bdr99 commented Feb 7, 2018

@eshellman Thanks for the link. We ended up using an approach similar to the one described on that page, in that the secret is passed through an HTTP header. I will edit the top comment to describe the new method.

@@ -0,0 +1,32 @@
# -*- coding: utf-8 -*-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you squash these migrations into one?
to do this:

  1. migrate back to 0002
  2. delete the 3 migrations (and the .pyc files!)
  3. makemigrations
  4. migrate

@@ -9,8 +9,8 @@
{% if book.cover_url %}<img src="{{ book.cover_url }}" />{% else %}<i class="fa fa-5x fa-book"></i>{% endif %}
</div>
<div class="large-3 small-6 columns">
<h5 class="booktitle" style="font-weight: bold;">{{ book.title_short }}</h5>
<p class="bookauthor">{{ book.author }}</p>
<h5 class="booktitle" style="font-weight: bold;"><a href="/book/{{ book.book_id }}">{{ book.title_short }}</a></h5>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason not to use {% url %} tag?

<div class="row">
{% for sameauthor_book in sameauthor %}
<div class="large-2 small-4 columns end">
<a href='/book/{{sameauthor_book.book_id}}'>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use url tag?

@@ -25,3 +28,48 @@ def test_view_by_anonymous(self):
self.assertEqual(r.status_code, 200)
r = anon_client.get("/license/", follow=True)
self.assertEqual(r.status_code, 200)

#Test to ensure that secret is in environment variables
self.assertEqual("GITENBERG_SECRET" in os.environ, True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how are you going to get "GITENBERG_SECRET" into the test environment?

rights: Public domain in the USA.
rights_url: http://creativecommons.org/about/pdm
subjects:
- '!lcsh: Love stories'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the wrong syntax.
wrong
- '!lcsh: Love stories'
right
- !lcsh: 'Love stories'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this is copied from a gitberg test. Let me double check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metadata=BookMetadata(book,rdf_library=rdf_library, enrich=should_enrich)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like the following code should duplicate the code that handles a post. so maybe better to put it on the model so both management command and post handler can call book.load_metadata(metadata)

@bdr99
Copy link
Contributor Author

bdr99 commented Feb 21, 2018

OK, we have addressed the following:

  • Travis test should not require secret
  • use url tags
  • combine migrations into one
  • remove duplicated code in load_repos management command and db.py

@eshellman
Copy link
Contributor

looks like tests fail because of travis not knowing the secret

@eshellman
Copy link
Contributor

No, that's not it; a 405 means the server won't accept the POST. I bet it's related to the way you tried to do csrf_exempt; try wrapping the as_view method in urls.py

@bdr99
Copy link
Contributor Author

bdr99 commented Feb 26, 2018

I got the tests working. The problem was that when the test client sent the HTTP POST request, it was being redirected because of the security settings in settings.py. And HTTP POST requests are not preserved when following redirects, so it was being changed to a GET request, which caused the 405 failure (Method not Allowed). I changed the test to use a secure request and it works now.

@eshellman eshellman merged commit 0045ced into master Feb 26, 2018
@eshellman
Copy link
Contributor

Deployed on production. Now add the secret into the environment and you're ready to start posting books.

@eshellman
Copy link
Contributor

or maybe the load_repos command needs running. let me know.

@bdr99
Copy link
Contributor Author

bdr99 commented Feb 28, 2018

Yes, the load_repos command needs to be run so that the data can be loaded into the new schema.

Edit: It would be a good idea to merge #63 before doing this though, as we fixed a bug about loading the repo name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants