Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download raw build log #3585

Merged
merged 44 commits into from May 30, 2018
Merged

Download raw build log #3585

merged 44 commits into from May 30, 2018

Conversation

@stsewd
Copy link
Member

@stsewd stsewd commented Feb 8, 2018

This closes #2921

The new endpoint for this is /api/v2/build/{id}/log

This is where the link to download the logs will be

screenshot-2018-2-7 build-fail read the docs

This is how it looks on the browser

rawlog

And attached is a full log

def log(self, request, *args, **kwargs):
build = self.get_object()
return Response(build.raw_log)


class BuildViewSet(SettingsOverrideObject):
Copy link
Member Author

@stsewd stsewd Feb 8, 2018

Because of this custom ViewSet the detail_route is not exposed, I tried putting the method on this class unsuccessfully, the screenshots and examples were generated by adding BuildViewSetBase to the urls.

What could be a solution for this?

Copy link
Member Author

@stsewd stsewd Feb 10, 2018

I think I found a solution for this

<div data-bind="visible: finished()">
<li>
<a href="#">
{% trans "Raw log" %}
Copy link
Member

@Blendify Blendify Feb 8, 2018

"Raw log" --> "View raw"?

Copy link
Member Author

@stsewd stsewd Feb 8, 2018

Is the log word implicit here? Or what about View raw log?

@Blendify
Copy link
Member

@Blendify Blendify commented Feb 9, 2018

@stsewd
Copy link
Member Author

@stsewd stsewd commented Feb 10, 2018

I did the above suggestion, solve the problem exposing the endpoint and add a test, I think this is ready for a review :)

self.assertIn(
'git checkout master\nSwitched to branch "master"',
resp.data
)
Copy link
Member Author

@stsewd stsewd Feb 10, 2018

Let me know if you want me to make a more robust test for checking the response.

Copy link
Member

@humitos humitos Feb 15, 2018

You can also add something like:

self.assertEqual(resp.data, build.raw_log)

@@ -337,6 +337,8 @@ def setUp(self):
'api_webhook_generic': {'status_code': 403},
'remoteorganization-detail': {'status_code': 404},
'remoterepository-detail': {'status_code': 404},
# This isn't a valid media type
'/api/v2/build/1/log.json': {'status_code': 404},
Copy link
Member Author

@stsewd stsewd Feb 10, 2018

For some reason the test suite insists on getting this media type for the /log endpoint.

Copy link
Member

@humitos humitos Feb 15, 2018

Why ,json? We are dealing with .txt, but I'd say that the extension is not necessary, right?

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

Yeah, this is just for implicit sending the media type rather than do it on the http headers.
This test https://github.com/rtfd/readthedocs.org/blob/761f14dfc72de7a5c9a4f4e874fda613d4548d8c/readthedocs/rtd_tests/tests/test_privacy_urls.py#L351 cheks for all endpoints reversing all posible urls (including media types explicity on the url), for the /log endpoint two urls are generated .txt and .json (but only txt media type is supported) I think is because the main media type is json, so I black listed that url.

Copy link
Member

@humitos humitos Feb 16, 2018

Mmm... Can we avoid the generation of .json since we don't need this here?

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

Well, I try with the recent change, but the CI fails :(. I think this is something related to the reversing process that I mention, should I investigate more deeper here? O leave it as is (since is related to the test suite, not the real code, I think)?

Copy link
Member

@humitos humitos left a comment

Nice work! I really liked this.

I noted some things to be cleaned up from the code.

Besides, I added other comments to discuss, like the template style.

@@ -466,6 +467,13 @@ def finished(self):
"""Return if build has a finished state."""
return self.state == BUILD_STATE_FINISHED

@property
def raw_log(self):
raw_log = loader.get_template('restapi/log.txt').render(
Copy link
Member

@humitos humitos Feb 15, 2018

Version: {{ build.version.verbose_name }} ({{ build.commit }})
Date: {{ build.date }}
Success: {{ build.success }}
RTD build information end
Copy link
Member

@humitos humitos Feb 15, 2018

I'd remove the start/end lines.

Success: {{ build.success }}
RTD build information end

{% for command in build.commands.all %}
Copy link
Member

@humitos humitos Feb 15, 2018

Are these command already sorted? What's the default sorting?

We should probably force the sorting here to avoid future problems using order_by or similar.

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

Copy link
Member

@humitos humitos Feb 16, 2018

Yes and no :)

That sorting is the default sorting and it's not the one that we want (since it will show the latest first because of the -), but on the other hand, this one applies to the Build model, not the BuildCommandResult

Anyway, I just checked this and it's what we need: sorted by start_time: https://github.com/rtfd/readthedocs.org/blob/7812234dbaecbc37e899e2ae73fce7c3de277433/readthedocs/builds/models.py#L511

So, nothing to do here I'd say

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

Yeah, that was the meta class that I saw the other time, I put the wrong model here, sorry for the confusion.

@@ -337,6 +337,8 @@ def setUp(self):
'api_webhook_generic': {'status_code': 403},
'remoteorganization-detail': {'status_code': 404},
'remoterepository-detail': {'status_code': 404},
# This isn't a valid media type
'/api/v2/build/1/log.json': {'status_code': 404},
Copy link
Member

@humitos humitos Feb 15, 2018

Why ,json? We are dealing with .txt, but I'd say that the extension is not necessary, right?

RTD build information end

{% for command in build.commands.all %}
[rtd-command-info] start-time: {{ command.start_time }}, end-time: {{ command.end_time }}, duration: {{ command.run_time }}, exit-code: {{ command.exit_code }}
Copy link
Member

@humitos humitos Feb 15, 2018

When I read this issue I was thinking just on plain simple txt file, but then you added some metadata to it (times, build id, etc). So, now I'm confused about what we want:

  • raw and simple txt plain text
  • ini style file
  • a mix of both --what you already implemented
  • json file (we already have this I think)

What do you think?

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

First I was doing just some plain commands, but them I realized that would be more helpful to have this metadata when viewing the logs, so I took some inspiration from travis https://api.travis-ci.org/v3/job/339760225/log.txt (probably I should mention this early, sorry). There is a json, but contains a lot of information and is hard to read.

Copy link
Member

@humitos humitos Feb 16, 2018

Oh! I see. It's a good idea to follow other's people ideas here.

I noticed that they also uses colored output, although not necessary, could be a good enhancement maybe later in another PR.


def render(self, data, accepted_media_type=None, renderer_context=None):
kwargs = renderer_context.get('kwargs', {})
if kwargs.get('format', self.format) == self.format:
Copy link
Member

@humitos humitos Feb 15, 2018

I think this is not necessary.

The example doesn't handle this: http://www.django-rest-framework.org/api-guide/renderers/#example

Also, take a look at the renderes that comes with DRF: https://github.com/encode/django-rest-framework/blob/master/rest_framework/renderers.py

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

I was getting a 500 error when requesting a file type distinct from txt here, so better to have a 404. I was not sure why the file type wasn't check automatically, probably something to do with the version of restframework or https://github.com/rtfd/readthedocs.org/pull/3585/files#r168669155.

Also the accepted_media_type wasn't really helpful, if I remember correctly the parameter was always None or the filetype passed was txt while the real file type requested was other (json for example).

Note the parameters are distinct from this version and from the example of the latest version.

Copy link
Member

@humitos humitos Feb 16, 2018

OK. I'd say that we need to research a little more here since I'm almost sure that this should be handled by DRF itself instead by us. Can you take a deeper look on this?

Try upgrading DRF to see if it's related to that or it's another thing. How other endpoint that only support JSON behaves when you pass format=html? Who handles that?

Those could be a good tests to have a better idea.

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

I think I figure out, the renderer also is expected to give a representation of a 404 error (any status actually).


class BuildViewSet(SettingsOverrideObject):

"""A pluggable class to allow for build cold storage."""

_default_class = BuildViewSetBase

@detail_route(renderer_classes=[PlainTextRenderer])
def log(self, request, *args, **kwargs):
Copy link
Member

@humitos humitos Feb 15, 2018

You don't need to redefine this here, but you need to move the @detail_route decorator to the definition in the Base class.

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

I try that, but was unsuccessful, here the decorator gets exposed registering the function, then SettingsOverrideObject call this function from the class on _default_class.

Copy link
Member

@humitos humitos Feb 16, 2018

Mmm... I don't know exactly how the SettingsOverrideObject works but maybe @agjohnson can help us here. Anyway, I think this is not the correct way of doing it and I'm not sure if this won't cause a problem in the .com site (the reason why we need the override object)

<div data-bind="visible: finished()">
<li>
<a href="{% url "build-log" build.pk %}">
{% trans "View raw" %}
Copy link
Member

@humitos humitos Feb 15, 2018

I think this is kind of a advanced user feature and I'd avoid exposing it to all the users since they probably don't care about this and it just adds more confusing things/elements. I mean, this is not something that you want to check/see as a final user. The log in the build page should be enough for the user and this API endpoint will be used by a script of similar by an advanced user.

On the other hand, I think this should be described in the documentation page of APIv2

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

I think this feature would be very useful for debugging failed builds, since here is more easy to read/copy the text. That's why this feature was suggested #2921. And also would be very handy for us while helping a user, since they just could given the raw log url, or the file on github as an attach or embedded. I don't think this as an advanced feature, anyone who is viewing the build page is probably because the build fail and want to know why or have a hint of what happened.

But I'm agreed that this section could be a little large, what about removing some of this info already provided by the raw log?

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

If I search correctly, the docs for apiv2 are auto generated https://readthedocs.org/api/v2/

Copy link
Member

@humitos humitos Feb 16, 2018

I would remove this for all the user and make that link available only if is_staff (so, core developers can easily access while debugging). As I said, if the user wants to take a look at the log, can use the log that it's already displayed in a friendly way in the same page.

In the issue it's mentioned "easy to grep", which I agree and I don't think that final user will take a look at the log very often to make it part of the UX (considering that it's already ower loaded in that section)

self.assertIn(
'git checkout master\nSwitched to branch "master"',
resp.data
)
Copy link
Member

@humitos humitos Feb 15, 2018

You can also add something like:

self.assertEqual(resp.data, build.raw_log)

@stsewd
Copy link
Member Author

@stsewd stsewd commented Feb 16, 2018

@humitos I answered to all your questions and apply some of your suggestions, sorry for not bean clear at the beginning why some decisions were made, hope now is all clear :).

@@ -1,10 +1,9 @@
RTD build information start
RTD build information
Build: {{ build.pk }}
Project: {{ build.project.slug }}
Version: {{ build.version.verbose_name }} ({{ build.commit }})
Date: {{ build.date }}
Success: {{ build.success }}
Copy link
Member

@humitos humitos Feb 16, 2018

What happen if you access this endpoint while the build is still buidling? Should this be allowed? Should we show the status here also?

Copy link
Member

@humitos humitos Feb 16, 2018

and do not show the Success at while builing?

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

If you enter while building you get the log whit the current commands that were executed. I don't know if we want to consider this case so much, since the url to view the raw log is displayed only when the build is finished. But if someone (using the api) wants to access to this endpoint, probably should check the current status with the other endpoint, what you think? I like the idea of not showing the success text, but I feel like we should display something else then.

Copy link
Member

@humitos humitos Feb 19, 2018

This is an endpoint, so this will more useful to hit from a script instead from a web browser.

So, I suggest to show the status of build but also the success when it's finished.

@@ -337,6 +337,8 @@ def setUp(self):
'api_webhook_generic': {'status_code': 403},
'remoteorganization-detail': {'status_code': 404},
'remoterepository-detail': {'status_code': 404},
# This isn't a valid media type
'/api/v2/build/1/log.json': {'status_code': 404},
Copy link
Member

@humitos humitos Feb 16, 2018

Mmm... Can we avoid the generation of .json since we don't need this here?

@humitos
Copy link
Member

@humitos humitos commented Feb 16, 2018

Also, I just realized something here: we already have this endpoint, https://readthedocs.org/api/v2/build/6680427/ that returns exactly the same that we need here but it's JSON formatted.

So, why not just using a different render when ?format=txt instead of creating a new endpoint? Like

https://readthedocs.org/api/v2/build/6680427/?format=txt

@stsewd
Copy link
Member Author

@stsewd stsewd commented Feb 16, 2018

@humitos I remember that I consider that option, but doesn't feel right at the beginning, since we need to edit the custom renderer and this will only works for this case no for a general media type. I will push some commits with that change to discuss here and if isn't ok will revert those commits.

if response.status_code != status.HTTP_200_OK:
return data['detail'].encode(self.charset)
data = render_to_string(
'restapi/log.txt', {'build': data}
Copy link
Member Author

@stsewd stsewd Feb 16, 2018

@humitos for reusing the same endpoint we need to render the info from here (although we could just take the build id from the data and get the raw_log property, but I think that's unnecessary)

Some comments if we choose this path:

  • the raw_log property from the build has no relevance now
  • We should rename the renderer class to something like PlainTextBuildRenderer

Copy link
Member Author

@stsewd stsewd Feb 16, 2018

Also some other parts will need to be removed, but basically this is the core, let me know if you want me to go in this direction.

Copy link
Member

@humitos humitos Feb 19, 2018

It seems that this is the way to go. I think it's clearer to have just one endpoint for this that renders in a different format depending on the extension/format attr.

Removing the property and renaming the renderer make sense to me.

Copy link
Member

@humitos humitos left a comment

I think this is the path to follow. Seems cleaner.

@@ -1,10 +1,9 @@
RTD build information start
RTD build information
Build: {{ build.pk }}
Project: {{ build.project.slug }}
Version: {{ build.version.verbose_name }} ({{ build.commit }})
Date: {{ build.date }}
Success: {{ build.success }}
Copy link
Member

@humitos humitos Feb 19, 2018

This is an endpoint, so this will more useful to hit from a script instead from a web browser.

So, I suggest to show the status of build but also the success when it's finished.

return data.encode(self.charset)
else:
raise Http404
response = renderer_context.get('response')
Copy link
Member

@humitos humitos Feb 19, 2018

If render_context is None (the default) this will fail.

Besides, I don't understand how it works. Why sometime we can have the data already generated and why other times we need to render_to_string? I'm not following this.

Copy link
Member Author

@stsewd stsewd Feb 19, 2018

When the response status is 200, we have the data from the model, but when the status is a error, we have a single dict {"detail": "explanation"'}. I think this is done in this way because the renderer only represent the data in another way, but here we are using the renderer for a specific case.

I'm not sure about the None value, but looking at the code of others renderes, this is a common pattern renderer_context = renderer_context or {}

Copy link
Member

@humitos humitos Feb 20, 2018

Maybe here is better to use response.exception as https://github.com/encode/django-rest-framework/blob/master/rest_framework/renderers.py#L165

instead of checking by the status code. Seems more readable.

if response.status_code != status.HTTP_200_OK:
return data['detail'].encode(self.charset)
data = render_to_string(
'restapi/log.txt', {'build': data}
Copy link
Member

@humitos humitos Feb 19, 2018

It seems that this is the way to go. I think it's clearer to have just one endpoint for this that renders in a different format depending on the extension/format attr.

Removing the property and renaming the renderer make sense to me.

Version: {{ build.version.verbose_name }} ({{ build.commit }})
Date: {{ build.date }}
State: {{ build.state }}
{% if build.state == 'finished' %}Success: {{ build.success }}{% endif %}
Copy link
Member Author

@stsewd stsewd Feb 20, 2018

I have to put this in one line, otherwise django inserts a newline :/

@stsewd
Copy link
Member Author

@stsewd stsewd commented Feb 20, 2018

@humitos I have updated the PR with the new endpoint and checked for the None value.

def render(self, data, accepted_media_type=None, renderer_context=None):
renderer_context = renderer_context or {}
response = renderer_context.get('response')
if response.status_code != status.HTTP_200_OK:
Copy link
Member Author

@stsewd stsewd Feb 20, 2018

Still here can be None, I better deep in the docs to see when renderer_context is None

Copy link
Member Author

@stsewd stsewd Feb 20, 2018

From the docs http://www.django-rest-framework.org/api-guide/renderers/#renderer_contextnone

Looks like this value is always passed here.

self.assertIn('RTD build information', resp.data)
self.assertIn('[rtd-command-info]', resp.data)
self.assertIn(b'RTD build information', resp.content)
self.assertIn(b'[rtd-command-info]', resp.content)
Copy link
Member

@humitos humitos Feb 20, 2018

Instead of using binary check here, you should decode the content since you want to compare string not data.

Copy link
Member

@humitos humitos left a comment

A couple of minor changes. After those one, I'd say it's ready to be merged.

def render(self, data, accepted_media_type=None, renderer_context=None):
renderer_context = renderer_context or {}
response = renderer_context.get('response')
if not response or response.exception:
Copy link
Member Author

@stsewd stsewd Feb 20, 2018

If there is no response probably is something wrong, anyway, this is just an extra protection since renderer_context "always" has this.

resp.content.decode()
)

def test_get_invalid_raw_log(self):
Copy link
Member Author

@stsewd stsewd Feb 20, 2018

One extra test to see if the response.exception works :)

@stsewd
Copy link
Member Author

@stsewd stsewd commented Feb 20, 2018

@humitos done! thanks for the review!

Copy link
Member

@humitos humitos left a comment

Good Job!

This PR is ready to merge for me.

RTD build information
Build: {{ build.pk }}
Project: {{ build.project.slug }}
Version: {{ build.version.verbose_name }} ({{ build.commit }})
Copy link
Member Author

@stsewd stsewd Feb 21, 2018

@humitos I just realized that here we don't have the project slug and version verbose name

Copy link
Member Author

@stsewd stsewd Feb 21, 2018

We only have the project id and commit

Copy link
Member

@humitos humitos Feb 22, 2018

Why? Take a look at your first screenshot and log attached: there were present.

We should have that information since we pass a build object to the template and the build is associated to a project, so we should be able to get that information (which is important).

Copy link
Member Author

@stsewd stsewd Feb 22, 2018

Because before this template was called from the Build object (raw_log property), but now, we uses the data variable, which contains the data specified on the serializer. We could get that info again by retrieving the object from the id given by the serializer, but I'm not sure if that's ok.

Copy link
Member

@ericholscher ericholscher May 22, 2018

I think making a new query is fine. This won't be a heavily hit API or anything, so I'm not too worried about performance.

@stsewd
Copy link
Member Author

@stsewd stsewd commented Feb 21, 2018

This is how a full log looks now

log.txt

Build: {{ build.id }}
Project: {{ build.project }}
Version: {{ build.commit }}
Date: {{ build.date }}
Copy link
Member Author

@stsewd stsewd Feb 21, 2018

Also, is the default format for dates fine?

Copy link
Member

@humitos humitos Feb 22, 2018

Yes! :)

Copy link
Member

@humitos humitos Feb 22, 2018

@stsewd stsewd force-pushed the download-raw-log branch from 3557759 to 86d7bde May 24, 2018
@stsewd
Copy link
Member Author

@stsewd stsewd commented May 24, 2018

This is how it looks now

screenshot-2018-5-23 a-repo-tes read the docs

146.txt

screenshot-2018-5-23 a-repo-tes read the docs 1

142.txt

@@ -100,6 +100,8 @@ class BuildSerializer(serializers.ModelSerializer):
"""Build serializer for user display, doesn't display internal fields."""

commands = BuildCommandSerializer(many=True, read_only=True)
project_slug = serializers.ReadOnlyField(source='project.slug')
version_slug = serializers.ReadOnlyField(source='version.slug')
Copy link
Member Author

@stsewd stsewd May 24, 2018

I needed to add this new fields here since we can't access this inside the template as an object but as the result of the serializer.

@ericholscher ericholscher merged commit cefb86f into readthedocs:master May 30, 2018
1 check passed
@stsewd stsewd deleted the download-raw-log branch May 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

5 participants