Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Hooks enhancement: running real file in place + context serialization #694

Closed

Conversation

eviweb
Copy link
Contributor

@eviweb eviweb commented May 3, 2016

hi,
here is an enhancement of how hooks work.
They now receive through the standard input stream, the cookiecutter context object, serialized in JSON .
This allows to unserialize this object, and then simply read and/or update some values that can be sent back to the main context, by writing a JSON serialized object to the standard output.

It is also possible to disable the hook duplication by adding the setting "_no_hookcopy": "yes" in the cookiecutter.json. Hooks are then run directly in place.
This should be useful, when a template author does not need to use template variables directly in his/her hooks and want to keep track of the template directory.
A concrete example is license management. In the documentation - Choice Variables (1.1+) - it is explained how to manage licenses in one file.
It should be easier to get many license file templates in a directory, and then just copy the selected one.
More information and examples are available in the updated Advanced usage page of the documentation of this PR.
This is the start of a proposition and I'm very interesting by getting some feedbacks.

Best regards

Eric

@eviweb
Copy link
Contributor Author

eviweb commented May 3, 2016

And default behaviours are preserved by default ;)

@hackebrot
Copy link
Member

Thank you @eviweb for submitting this PR! 🙇

We will try to give it a thorough review soon.

@hackebrot hackebrot added the enhancement This issue/PR relates to a feature request. label May 4, 2016
@@ -50,7 +52,7 @@ def find_hooks():
return r


def run_script(script_path, cwd='.'):
def run_script(script_path, cwd='.', context={}):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this makes run_script() accept a context dict, you might as well combine run_script() and run_script_with_context() into one function. Then if run_script_with_context() is too long, feel free to refactor it somehow if you think it will make it more readable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to refactoring and getting rid of the mutable default arg

@pydanny
Copy link
Member

pydanny commented May 14, 2016

I like the idea of this pull request. It would allow us to simplify some of the things we've done with Cookiecutter Django. That simplification means we could add more sophistication to that project. So in general, I'm in favor of this pull request. 😄

That said, here is my critique so far:

  1. It took me a few minutes to wrap my head around the double negative of no_hookcopy. I think this feature would be better served as being called via _hookcopy (or _hook_copy) that defaults to True. To trigger the behavior in this pull request, one would set the value to False.
  2. If the word Yes is used instead of True/False, then why not check that via text.lower() == 'yes'?

@@ -84,6 +96,13 @@ def run_script_with_context(script_path, cwd, context):
:param cwd: The directory to run the script from.
:param context: Cookiecutter project template context.
"""
if '_no_hookcopy' in context and re.match(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not check this via context['_no_hookcopy'].lower() == 'yes'?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 👎 on this functionality- seems like an unnecessary optimisation to me?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I take ☝️ back, it is useful to have access to the template directory in the hooks.
I think the docs would benefit from an example that shows this off.

@michaeljoseph
Copy link
Contributor

I think we should change hook_copy to run_in_place, since this is the change being made?
It also makes the primary benefit of running in-place a little clearer (you can have access to the template directory).

Disabling hook duplication
~~~~~~~~~~~~~~~~~~~~~~~~~~
As explained in `Using Pre/Post-Generate Hooks (0.7.0+)`_, you can disable the default behaviour of hook duplication. This is interesting when you don't need to use template variables directly in your hook.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a hook example that demonstrates using __file__ to locate and use the template directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's also important to note that you cannot use template variables in hook scripts if you're running in-place (and you would therefore have to use stdin context if needed)

@eviweb
Copy link
Contributor Author

eviweb commented May 15, 2016

hi,
thanks for your feedbacks, here are all the modifications among which the notable are:

  • run_script was removed in favor of the refactored run_script_with_context
  • _no_hookcopy was renamed as _run_hook_in_place and is now of boolean type.
    I preferred keeping the word hook in the setting name, it should be more explicit in a configuration context.
  • hooks docs were extracted to their own page
  • hooks docs contain an example on how retrieving the template directory from the run hook

This PR was much more a proof of concept about running hooks in place while keeping the ability of getting context information in hooks.
I think Cookiecutter would gain to provide a real plugin API.
Hooks should be considered as plugins and should be able to register for some events.
This would allow authors to create dedicated plugins for one event (ie: pre_gen_project) and it would facilitate the creation of new entry points for hooks.
I will try to propose something if nothing is already started.

Regarding this particular PR, I will try to fix codecov and appveyor failing tests asap.
If you have some advices to help me (I just begin in Python) I would appreciate ;)
Should it be interesting to be able to change the context [de]serializer used to pass the Cookiecutter context through the stdin ?

best regards

Eric

@codecov-io
Copy link

codecov-io commented May 15, 2016

Current coverage is 100%

Merging #694 into master will not change coverage

@@           master   #694   diff @@
====================================
  Files          13     14     +1   
  Lines         583    779   +196   
  Methods         0      0          
  Messages        0      0          
  Branches        0      0          
====================================
+ Hits          583    779   +196   
  Misses          0      0          
  Partials        0      0          

Sunburst

Powered by Codecov. Last updated by 36a0080...993379d

@michaeljoseph
Copy link
Contributor

michaeljoseph commented May 16, 2016

@eviweb: regarding appveyor failures, it looks like the hook script is done executing before you can write to stdin, and this behaviour changed in python >= 3.3.5: http://bugs.python.org/issue19612.
There's a good explanation here: http://stackoverflow.com/questions/23688492/oserror-errno-22-invalid-argument-in-subprocess

@eviweb
Copy link
Contributor Author

eviweb commented May 17, 2016

@michaeljoseph, thanks for your feedback.
I had also found these links. But what is disturbing is that I encountered this issue only on appveyor.
I was not able to reproduce it on my windows box :(
I provide a fix that handles this issue by catching the exception on windows config and log it at a warning logging level. But I didn't find any other solution to prevent this.
If you have any idea on how to solve this issue, feel free to share, I'm really interested.
I have updated the documentation with a warning topic, that quickly explains this issue. If someone can review this paragraph, my english is not so good :(
I hope all will be good now.

Best regards

Eric

@eviweb
Copy link
Contributor Author

eviweb commented May 17, 2016

Here we are !
Please let me now if something should be improved and if it is interesting to permit authors to use their own customized [de]serializers instead of the default JSON one.
Best regards
Eric

@hackebrot
Copy link
Member

Hey folks,

I haven't had a chance to look at this in depth. That being said, what do we need testfixtures for?

@eviweb
Copy link
Contributor Author

eviweb commented May 17, 2016

@hackebrot, for the log capture utility testfixtures.LogCapture and I solved an issue encountered while using pytest.raises to catch exceptions by using testfixtures.ShouldRaise.
But I'm opened to any suggestions, I'm completely new to Python programming.

Eric

@hackebrot
Copy link
Member

Thanks for the explanation @eviweb! 🙇

I am trying to be extra careful with adding dependencies and pytest has raises. What was the issue with it? Do you have a traceback or even better a ticket in the pytest issue tracker?

@eviweb
Copy link
Contributor Author

eviweb commented May 18, 2016

@hackebrot, you're perfectly right regarding adding dependencies.
The issue I encountered with pytest.raises was that it didn't catch the OSError which resulted in a non initialized ExceptionInfo object. While I inspired myself with what was done in ./tests/test_hooks.py, I certainly have done something incorrectly.
I gave a new try this morning and all seem to be fine now.
I also succeed (with a lot of tenacity ! 😄) in mocking logging.warn so we can get rid of a no more needed testfixtures dependency.

Hope all will be fine

Eric

@@ -8,6 +8,7 @@ deps = pytest
pytest-cov
pytest-mock
freezegun
mock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get the mock dependency from pytest-mock already...

@eviweb
Copy link
Contributor Author

eviweb commented May 25, 2016

@michaeljoseph,
removing this dependency makes tests failing for python 3.x.
The reason is that the mock library is included by default in the unittest package.
The solution I've found to get around this issue is to make the import using a try/except block like this:

try:
    import mock
except ImportError:
    from unittest import mock

I don't know if this is the better solution to get rid of the tox.ini explicit dependency to the mock package.
Is there a better way to deal with this ?

thanks

Eric

@eviweb
Copy link
Contributor Author

eviweb commented May 25, 2016

All tests passed with the fix explained above.
You can refer to eviweb/cookiecutter@0730b32 to see what it looks like.
Is it satisfying for you ?

Eric

@eviweb
Copy link
Contributor Author

eviweb commented Jun 2, 2016

@michaeljoseph, please forget my two previous posts.
I've found a solution.

@eviweb eviweb changed the title Hooks enhancement: running real file in place + context serialization WIP - Hooks enhancement: running real file in place + context serialization Jun 8, 2016
Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
eviweb added 18 commits June 8, 2016 13:53
the abstract class permit to uniformize the serializer interface and to
prevent some encoding issues under the hood

the pickle serializer will be needed to keep the serializer facade
configuration during hook calls

Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
as they are no more needed

Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
regarding the facade:
- it does not accept a type argument anymore
- instead the facade provides a fluent method 'use' to specify the
serializer type to use

on the config side the _serializers key provides two settings:
- classes: to reference the serializers implementations
- use: to specify the serializer to use

Signed-off-by: Eric Villard <dev@eviweb.fr>
[ci skip]

Signed-off-by: Eric Villard <dev@eviweb.fr>
it provides some helper classes:
- Runner: run cookierunner main command for testing
- SettingObject: object to configure the runtime environment

Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
as it needs a serious refactoring

Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
extract all common behaviours to support.AbstractAcceptanceTest class
it is the base class to write acceptance tests

Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
it failed with a real cookiecutter context
the base64 encoding contained linebreaks that were not well handled by
popen.communicate

Signed-off-by: Eric Villard <dev@eviweb.fr>
get_context: to be used in hooks to retrieve the context object
put_context: to be used in hooks to update the context object

Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
@eviweb eviweb force-pushed the real-hooks-serialized-context branch from 25e2300 to 4017e44 Compare June 8, 2016 11:54
@eviweb
Copy link
Contributor Author

eviweb commented Jun 8, 2016

Hi everyone,
here is an improvement of this PR.
First I would like to apologize because the following should be a little bit long, I will try to be as shorter as I can...
To summarize, this PR proposes two new main features that are close together.

Running real hooks in place

From a template author point of view, running real hooks in place is simply done by setting the key _run_hook_in_place to true in the cookiecutter.json configuration file as below:

{
    "full_name": "Your name",
    "email": "Your address email (eq. you@example.com)",
    "github_username": "Your github username",
    "project_name": "Name of the project",
    "project_slug": "{{ cookiecutter.project_name.lower().replace(' ', '-') }}",
    "project_short_description": "A short description of the project",
    "release_date": "{% now 'local' %}",
    "version": "0.1.0",
    "_extensions": ["jinja2_time.TimeExtension"],
    "_run_hook_in_place": "true"
}

Context serialization

From a python hook author point of view, using the cookiecutter context from within a hook is simply done by consuming the new serialization module API :

  • get_context() : to get the context object
  • put_context(context): to update the context dictionary

Concretely, imagine your post_gen_project.py hook needs some dedicated configuration info from your pre_gen_project.py hook, it could look like:

  • pre_gen_project.py:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from cookiecutter.serialization import get_context, put_context

# adding some values to be consumed by the post_gen_project
context = get_context()
context['for_post_gen_project'] = 'From pre_gen_project to post_gen_project'
put_context(context)
  • post_gen_project.py:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from cookiecutter.serialization import get_context, put_context

context = get_context()
print(context['for_post_gen_project'])

From a shell hook author it is a little bit more complicated, but the serialization API offers all the services to achieve this goal.
Before entering in more details, a quick note on the serialization public API.

Serialization API

It is composed of five module functions:

  • get_context() : to get the context object Should be used from hooks
  • put_context(context): to update the context object Should be used from hooks
  • make_persistent(subject): to make a given subject persistent between processes
  • get_persistent(klass): to retrieve an object from its persistent state
  • remove_persisent(klass): to remove a persistent object

and one class:

  • SerializationFacade: used to serialize/deserialize subjects

The SerializationFacade is an abstraction that shadows the underlying serialization process and the embedded serializers.
It offers a clear and unique interface for its clients.
By default it provides two serializers:

  • JsonSerializer: used to serialize json serializable objects (ie. the cookiecutter context), it is the default one
  • PickleSerializer: used to serialize objects with their internal state (ie. make an object persistent)

Calling the SerializationFacade.serialize() method will always produce a string following this pattern: [serializer_id]|[base64_encoded]$, where:

  • [serializer_id] is the key under which a serializer is referenced by the facade and in this case, the serializer used to serialize the subject
  • [base64_encoded] is a base64 representation of the serialized string where all \n chars are replaced by *

Note

  • Why using base64 encoding ?
    It permits to solve a bunch of encoding charset related issues that I encountered
  • Why replacing '\n' chars by * ?
    It solves an issue encountered when writing to stdin - subprocess.Popen.communicate().
    * was arbitrarily chosen but is not part of the characters used in base64 encoding

So, to come back to the side of a shell hook author, it might be preferable to deal with a different serialization format than JSON.
To achieve this goal it should be interesting to create our own serializer.
The AbstractSerializer class is here to help. Nothing to do else focusing on the serialization/deserialization process.
Here is an example:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from cookiecutter.serialization import AbstractSerializer

class OurOwnSerializer(AbstractSerializer):
    def _do_serialize(self, subject):
        """
        implement this method to do the serialization
        """

    def _do_deserialize(self, string):
        """
        implement this method to do the deserialization
        """

The AbstractSerializer takes care about the encoding parts.

But what to do now ?
How can we reference our own serializer ?
How can we make it usable by cookiecutter ?
To reference our own serializer, simply add this information:

{
    "_serializers": {
        "use": "ourown",
        "classes": {
            "ourown": "serializers.OurOwnSerializer"
        }
    }
{

in the template cookiecurrer.json configuration file as below :

{
    "full_name": "Your name",
    "email": "Your address email (eq. you@example.com)",
    "github_username": "Your github username",
    "project_name": "Name of the project",
    "project_slug": "{{ cookiecutter.project_name.lower().replace(' ', '-') }}",
    "project_short_description": "A short description of the project",
    "release_date": "{% now 'local' %}",
    "version": "0.1.0",
    "_extensions": ["jinja2_time.TimeExtension"],
    "_serializers": {
        "use": "ourown",
        "classes": {
            "ourown": "serializers.OurOwnSerializer"
        }
    }
}

A perceptive mind would have noticed this part serializers.OurOwnSerializer that indicates the class is defined in a file named serializers.py (the name does not really import but must be set in accord with the module name in the json file).
So to make it usable by cookiecutter, simply create an ./extra directory at the root of the template directory and put your file in.
And what ?
Nothing else...

Coming soon

As a bonus feature, a new hook pre_user_prompt.py (or different name) which could be used to populate some context values just before the user is prompted for the configuration.
A concrete case is to dynamically create a list of choices of licenses from license files which are set in a directory.

This PR needs some little refactorings and I will wait for this PR #749 to be merged before updating the documentation.
All is tested and I provided some acceptance tests that show concretely how to use this work in production.
Writing these tests permits me to add a support module in the tests folder.
It provides some facilities to create acceptance tests and then be closer real use of cookiecutter.

I hope this will find you interested, and please, feel free to share your impressions.
Thanks for your attention,

best regards

Eric

Signed-off-by: Eric Villard <dev@eviweb.fr>
Signed-off-by: Eric Villard <dev@eviweb.fr>
@eviweb
Copy link
Contributor Author

eviweb commented Jun 9, 2016

Hi,
I've just refactored the hooks module.
I introduced a handy function named get_from_context to the config module.
As I haven't so much time now, here is a copy of its signature:

def get_from_context(context, key, default=None, update=False):
    """
    Get the value referenced by a given key from a given context
    Keys can be defined using dot notation to retrieve values from nested
    dictionaries
    Keys can contain integers to retrieve values from lists
    ie.
    context = {
        'cookiecutter': {
            'project_name': 'Project Name'
        }
    }
    project_name = get_from_context(context, 'cookiecutter.project_name')

    :param context: context to search in
    :param key: key to look for
    :param default: default value that will be returned if the key is not found
    :param update: if True, create the key in the context and set its value
                   using the default argument value
    """

Next step, the new pre_user_prompt.py hook...

Eric

- set_to_context: permit to set key/value pairs to a given context using
  dot notation
- get_from_cookiecutter_context: shorthand function to get values from
  the cookiecutter sub context using dot notation
- set_to_cookiecutter_context: shorthand function to set key/value pairs
  to the cookiecutter sub context using dot notation

Signed-off-by: Eric Villard <dev@eviweb.fr>
it is run before user is prompted for the template configuration
this allows to populate the cookiecutter context dynamically (ie. list
of licenses generated from files in a directory)

Signed-off-by: Eric Villard <dev@eviweb.fr>
@eviweb
Copy link
Contributor Author

eviweb commented Jun 9, 2016

Here we are, the development phase is finished. 😅
Reviews and comments would really be appreciated.
It remains to complete the documentation and potentially do some refactoring.
I hope you will find this work useful, for my part I think it is a really valuable improvement.
I took care to provide some simple APIs that should be very productive and easy to take in hand and adopt.
To summarize, the provided features are:

  • the ability to run real hooks in place
  • the context serialization which permit to get/set context values from hooks
  • a new hook: pre_user_prompt to populate the cookiecutter context just before the user is prompted for the template configuration (take a look at the test_run_pre_user_prompt_hook in tests/test_real_hook_acceptance.py and its fixtures under tests/test-real-hook-acceptance/templates/pre_user_prompt directory for a concrete example on how to dynamically build a list of license choices)
  • lazy loading from a ./extra directory
  • 4 helper functions to deal with the context configuration:
    • get_from_context: to get values from a given context using dot notation (see my previous topic)
    • set_to_context: to set key/value pairs to a given context using dot notation
    • get_from_cookiecutter_context: same as get_from_context but it prefixes the given key with cookiecutter. which limit the search under the cookiecutter sub context
    • set_to_cookiecutter_context: same as set_to_context but it prefixes the given key with cookiecutter. which limit the insertion under the cookiecutter sub context
  • hooks failure messages are now caught from the stderr and are added to the FailedHookException message
  • the support module under ./tests to help in writing acceptance tests

I think that's all...
I was very cautious not to generate any regression.
All previous features work as before.
As I said yesterday, I will wait for the PR #749 to be merged before updating the doc.
Is there any plan about this merging ?

Thanks for your help

Eric

@michaeljoseph
Copy link
Contributor

Hi @eviweb 😄

Thanks for all this code- it's an impressive amount of work!
However, I think it's unlikely for this PR to get accepted in it's current form because it's making so many changes- this makes it daunting and time-consuming for us to review.
So I'd suggest breaking this up into multiple pieces of work and re-submitting:

  • the running hooks in place functionality was pretty much ready to go
  • I like the idea of the pre_user_prompt hook
  • the context serialisation work is quite complicated- the original approach of passing json to the hooks on stdin would be more appropriate I think.

Thanks again and please keep contributing 😸

@eviweb
Copy link
Contributor Author

eviweb commented Jun 13, 2016

Hi @michaeljoseph,
first I would like to share my first feeling after having read your message.
I don't want to hide that I was a little bit disappointed, not by the fact this PR cannot be accepted in its state, but by the given reason and particularly this one time-consuming for us to review, even if I understand what you mean.
Yes it will take time to review this code, but I believe less than you think, and much less than I spent on this work. I also have a family, a job, a lot of things to do, but as I found your project interesting, I accepted to spend my free time and parts of my nights on this PR, taking care of producing what I think is good code, without forgetting its related documentation.
Doing this, I would naively expect that some of the people invested in this project could take a half or one hour of his/her own time to review a part of this feature.
Which seems not to be the case.
It firstly discourage me going further, because splitting this feature that I personally consider as a whole:

  • will be time-consuming for me, and I sincerely prefer to spend this time in writing the missing documentation, and potentially working on what I found during this work that really lacks: a plugin API
  • it won't save the review time of new PRs. The only benefit I see is to save the time of "search", even if each modification is pretty obvious
  • I have no guarantee on my side that new PRs won't go the same way as this one

That's being said and with the time of the week-end, I decided to put my susceptibility aside 😌.
For now I'm still blocked by this PR #749.
To save the time of everyone of us, maybe can I participate to the review process by organizing a step by step review, extracting each part one by one with my comments, either here or in a new linked issue. Then we can discuss about the targeted modification and decide later about the opportunity of splitting this work or not.
Is it something conceivable ?

Best regards

Eric

@ssbarnea
Copy link
Member

ssbarnea commented Jan 8, 2020

I assume it was abandoned.

@ssbarnea ssbarnea closed this Jan 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement This issue/PR relates to a feature request.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants