Generate context objects in read_file() #795

Merged
merged 22 commits into from Jun 15, 2013

Conversation

Projects
None yet
5 participants
@wking
Contributor

wking commented Mar 22, 2013

This pulls commits 1ad4afc through c0a4a50 off of #671. All of the
earlier #671 work is already in master, so this shoul be ready to
go once we deal with any internal issues.

The assorted generators all use read_file() to read in the file
contents and metadata. Previously, they sometimes parse additional
metadata, fire off signals, and initialize a pelican.context.Content
subclass on their own. Reduce duplicated code and increase
consistency by shifting all that stuff into read_file() itself.

The tricky part here is that this makes some backwards incompatible
changes in the name of consistency. For example, a goodly number of
signals were renamed to standardize on *_generator_* instead of
*_generate_*. I haven't gone through and added formal deprecation
support for these changes yet (a la deprecated_attribute), but it
should be possible to work something like that up if folks feel it is
very important.

FILES_TO_COPY has also been replaced with the more general
EXTRA_PATH_METADATA and the artificial static/ path's are gone. I
think this is an incredibly useful change (arbitrary metadata for
static content! Consistent metadata extraction!), but it is a major
shift. It's harder to keep backwards compatibility here without some
really ugly conditionals.

Finally, there are a largish number of commits in this pull request,
since my approach was to make the fundamental shifts, and then
gradually clean up the bits that were newly broken. I think this
makes for more understandable commits, but I'd be willing to squash
things together if others feel like that would make things clearer.
The best way to simplify would be to split this into
sub-pull-requests, but I couldn't find any good lines for breaking
this one up.

pelican/contents.py
@@ -93,7 +95,8 @@ def __init__(self, content, metadata=None, settings=None,
if hasattr(self, 'lang') and self.lang in settings['DATE_FORMATS']:
self.date_format = settings['DATE_FORMATS'][self.lang]
else:
- self.date_format = settings['DEFAULT_DATE_FORMAT']
+ self.date_format = settings.get(
+ 'DEFAULT_DATE_FORMAT', '%a %d %B %Y')

This comment has been minimized.

@almet

almet Mar 22, 2013

Member

that should go in settings.py I guess.

@almet

almet Mar 22, 2013

Member

that should go in settings.py I guess.

This comment has been minimized.

@wking

wking Mar 22, 2013

Contributor

On Fri, Mar 22, 2013 at 12:25:18AM -0700, Alexis Metaireau wrote:

@@ -93,7 +95,8 @@ def init(self, content, metadata=None, settings=None,
if hasattr(self, 'lang') and self.lang in settings['DATE_FORMATS']:
self.date_format = settings['DATE_FORMATS'][self.lang]
else:

  •            self.date_format = settings['DEFAULT_DATE_FORMAT']
    
  •            self.date_format = settings.get(
    
  •                'DEFAULT_DATE_FORMAT', '%a %d %B %Y')
    

that should go in settings.py I guess.

Will fix…

@wking

wking Mar 22, 2013

Contributor

On Fri, Mar 22, 2013 at 12:25:18AM -0700, Alexis Metaireau wrote:

@@ -93,7 +95,8 @@ def init(self, content, metadata=None, settings=None,
if hasattr(self, 'lang') and self.lang in settings['DATE_FORMATS']:
self.date_format = settings['DATE_FORMATS'][self.lang]
else:

  •            self.date_format = settings['DEFAULT_DATE_FORMAT']
    
  •            self.date_format = settings.get(
    
  •                'DEFAULT_DATE_FORMAT', '%a %d %B %Y')
    

that should go in settings.py I guess.

Will fix…

This comment has been minimized.

@wking

wking Mar 22, 2013

Contributor

On Fri, Mar 22, 2013 at 12:25:18AM -0700, Alexis Metaireau wrote:

@@ -93,7 +95,8 @@ def init(self, content, metadata=None, settings=None,
if hasattr(self, 'lang') and self.lang in settings['DATE_FORMATS']:
self.date_format = settings['DATE_FORMATS'][self.lang]
else:

  •            self.date_format = settings['DEFAULT_DATE_FORMAT']
    
  •            self.date_format = settings.get(
    
  •                'DEFAULT_DATE_FORMAT', '%a %d %B %Y')
    

that should go in settings.py I guess.

Thinking about this more, I've gone the way I have because I expect to
occasionally get arbitrary settings dicts. Since these dicts may not
have the desired key, we need a default and it's easy to store this
default locally. Maybe a better approach would be:

self.date_format = settings.get(
'DEFAULT_DATE_FORMAT',
pelican.settings._DEFAULT_CONFIG['DEFAULT_DATE_FORMAT'])

which could be encapsulated in a pelican.settings.get_setting() helper
function:

def get_setting(settings, key):
if key in settings:
return settings[key]
return _DEFAULT_CONFIG[key]

which handles the "someone passed me {} as a settings dictionary" case
well. This would also make the tests cleaner, since many of them
currently copy _DEFAULT_CONFIG and tweak a few variables, while with
this approach they could just pass a small dict of overrides.

p.s. I'm not sure why _DEFAULT_CONFIG has an underscore. It is
often accessed by other Pelican modules, including plugins ;).

@wking

wking Mar 22, 2013

Contributor

On Fri, Mar 22, 2013 at 12:25:18AM -0700, Alexis Metaireau wrote:

@@ -93,7 +95,8 @@ def init(self, content, metadata=None, settings=None,
if hasattr(self, 'lang') and self.lang in settings['DATE_FORMATS']:
self.date_format = settings['DATE_FORMATS'][self.lang]
else:

  •            self.date_format = settings['DEFAULT_DATE_FORMAT']
    
  •            self.date_format = settings.get(
    
  •                'DEFAULT_DATE_FORMAT', '%a %d %B %Y')
    

that should go in settings.py I guess.

Thinking about this more, I've gone the way I have because I expect to
occasionally get arbitrary settings dicts. Since these dicts may not
have the desired key, we need a default and it's easy to store this
default locally. Maybe a better approach would be:

self.date_format = settings.get(
'DEFAULT_DATE_FORMAT',
pelican.settings._DEFAULT_CONFIG['DEFAULT_DATE_FORMAT'])

which could be encapsulated in a pelican.settings.get_setting() helper
function:

def get_setting(settings, key):
if key in settings:
return settings[key]
return _DEFAULT_CONFIG[key]

which handles the "someone passed me {} as a settings dictionary" case
well. This would also make the tests cleaner, since many of them
currently copy _DEFAULT_CONFIG and tweak a few variables, while with
this approach they could just pass a small dict of overrides.

p.s. I'm not sure why _DEFAULT_CONFIG has an underscore. It is
often accessed by other Pelican modules, including plugins ;).

This comment has been minimized.

@almet

almet Mar 24, 2013

Member

you're right on both fronts.

  1. We should remove this underscore :)
  2. Having a get_setting is probably a good idea, which helps testing for instance.
@almet

almet Mar 24, 2013

Member

you're right on both fronts.

  1. We should remove this underscore :)
  2. Having a get_setting is probably a good idea, which helps testing for instance.

This comment has been minimized.

@wking

wking Mar 24, 2013

Contributor

On Sat, Mar 23, 2013 at 06:40:42PM -0700, Alexis Metaireau wrote:

  1. We should remove this underscore :)
  2. Having a get_setting is probably a good idea, which helps testing
    for instance.

I'll add these and repush.

@wking

wking Mar 24, 2013

Contributor

On Sat, Mar 23, 2013 at 06:40:42PM -0700, Alexis Metaireau wrote:

  1. We should remove this underscore :)
  2. Having a get_setting is probably a good idea, which helps testing
    for instance.

I'll add these and repush.

pelican/contents.py
@@ -108,8 +111,8 @@ def __init__(self, content, metadata=None, settings=None,
# manage status
if not hasattr(self, 'status'):
- self.status = settings['DEFAULT_STATUS']
- if not settings['WITH_FUTURE_DATES']:
+ self.status = settings.get('DEFAULT_STATUS', 'published')

This comment has been minimized.

@almet

almet Mar 22, 2013

Member

don't put the default values in here, but in settings.py

@almet

almet Mar 22, 2013

Member

don't put the default values in here, but in settings.py

pelican/contents.py
- self.status = settings['DEFAULT_STATUS']
- if not settings['WITH_FUTURE_DATES']:
+ self.status = settings.get('DEFAULT_STATUS', 'published')
+ if not settings.get('WITH_FUTURE_DATES', True):

This comment has been minimized.

@almet

almet Mar 22, 2013

Member

ditto

@almet

almet Mar 22, 2013

Member

ditto

@@ -169,6 +172,9 @@ def _update_content(self, content, siteurl):
:param siteurl: siteurl which is locally generated by the writer in
case of RELATIVE_URLS.
"""
+ if not content:
+ return content

This comment has been minimized.

@almet

almet Mar 22, 2013

Member

maybe we should make this consistent and be sure we return a string? "return ''"

@almet

almet Mar 22, 2013

Member

maybe we should make this consistent and be sure we return a string? "return ''"

This comment has been minimized.

@wking

wking Mar 22, 2013

Contributor

On Fri, Mar 22, 2013 at 12:26:50AM -0700, Alexis Metaireau wrote:

@@ -169,6 +172,9 @@ def _update_content(self, content, siteurl):
:param siteurl: siteurl which is locally generated by the writer in
case of RELATIVE_URLS.
"""

  •    if not content:
    
  •        return content
    

maybe we should make this consistent and be sure we return a string?
"return ''"

The only way content should be None is for Content instances
where you don't care about the content (i.e. Static, since you just
copy the files without peeking inside). I don't see any reason to
return a string in that case.

If you're worried about the other Content subclasses, if content
starts out as the empty string, it's going to end up as the empty
string anyway, so cutting out early has no effect.

@wking

wking Mar 22, 2013

Contributor

On Fri, Mar 22, 2013 at 12:26:50AM -0700, Alexis Metaireau wrote:

@@ -169,6 +172,9 @@ def _update_content(self, content, siteurl):
:param siteurl: siteurl which is locally generated by the writer in
case of RELATIVE_URLS.
"""

  •    if not content:
    
  •        return content
    

maybe we should make this consistent and be sure we return a string?
"return ''"

The only way content should be None is for Content instances
where you don't care about the content (i.e. Static, since you just
copy the files without peeking inside). I don't see any reason to
return a string in that case.

If you're worried about the other Content subclasses, if content
starts out as the empty string, it's going to end up as the empty
string anyway, so cutting out early has no effect.

This comment has been minimized.

@almet

almet Mar 24, 2013

Member

that's for consistency: if we expect this to return a string, then it should probably return a string (but that's not an important thing anyway)

@almet

almet Mar 24, 2013

Member

that's for consistency: if we expect this to return a string, then it should probably return a string (but that's not an important thing anyway)

This comment has been minimized.

@wking

wking Mar 24, 2013

Contributor

On Sat, Mar 23, 2013 at 06:42:35PM -0700, Alexis Metaireau wrote:

@@ -169,6 +172,9 @@ def _update_content(self, content, siteurl):
:param siteurl: siteurl which is locally generated by the writer in
case of RELATIVE_URLS.
"""

  •    if not content:
    
  •        return content
    

that's for consistency: if we expect this to return a string, then
it should probably return a string (but that's not an important
thing anyway)

How about bumping this cutout up the stack into .get_content():

@memoized
def get_content(self, siteurl):

    if self._content and hasattr(self, '_get_content'):
        content = self._get_content()
    else:
        content = self._content
    return self._update_content(content, siteurl)

although then the inconsistency is in get_content() :p. At some
point, content is going to be either None or a string. We currently
support both in ._content, but not in ._update_content. I'm fine
with fixing this inconsistency at almost any level, but I don't think
we should be using the empty string (instead of None) for
Static._content.

@wking

wking Mar 24, 2013

Contributor

On Sat, Mar 23, 2013 at 06:42:35PM -0700, Alexis Metaireau wrote:

@@ -169,6 +172,9 @@ def _update_content(self, content, siteurl):
:param siteurl: siteurl which is locally generated by the writer in
case of RELATIVE_URLS.
"""

  •    if not content:
    
  •        return content
    

that's for consistency: if we expect this to return a string, then
it should probably return a string (but that's not an important
thing anyway)

How about bumping this cutout up the stack into .get_content():

@memoized
def get_content(self, siteurl):

    if self._content and hasattr(self, '_get_content'):
        content = self._get_content()
    else:
        content = self._content
    return self._update_content(content, siteurl)

although then the inconsistency is in get_content() :p. At some
point, content is going to be either None or a string. We currently
support both in ._content, but not in ._update_content. I'm fine
with fixing this inconsistency at almost any level, but I don't think
we should be using the empty string (instead of None) for
Static._content.

from pelican.utils import get_date, pelican_open
+logger = logging.getLogger(__name__)

This comment has been minimized.

@almet

almet Mar 22, 2013

Member

yay more logging.

@almet

almet Mar 22, 2013

Member

yay more logging.

@almet

This comment has been minimized.

Show comment
Hide comment
@almet

almet Mar 22, 2013

Member

Seems generally okay.

  • make sure compat stays okay
  • don't set the default values when doing settings.get() (or we can do that everywhere instead, maybe that would be a better practice)
Member

almet commented Mar 22, 2013

Seems generally okay.

  • make sure compat stays okay
  • don't set the default values when doing settings.get() (or we can do that everywhere instead, maybe that would be a better practice)
@wking

This comment has been minimized.

Show comment
Hide comment
@wking

wking Mar 22, 2013

Contributor

On Fri, Mar 22, 2013 at 12:31:40AM -0700, Alexis Metaireau wrote:

  • make sure compat stays okay

Does this mean we need to figure out how to support FILES_TO_COPY
with the new setup? :(

Contributor

wking commented Mar 22, 2013

On Fri, Mar 22, 2013 at 12:31:40AM -0700, Alexis Metaireau wrote:

  • make sure compat stays okay

Does this mean we need to figure out how to support FILES_TO_COPY
with the new setup? :(

@almet

This comment has been minimized.

Show comment
Hide comment
@almet

almet Mar 24, 2013

Member

We don't need to figure out how to deal with it seamlessly, but we need to issue a warning if people are using the old way of handling this, so they know it's not working anymore (and eventually how to fix it)

Member

almet commented Mar 24, 2013

We don't need to figure out how to deal with it seamlessly, but we need to issue a warning if people are using the old way of handling this, so they know it's not working anymore (and eventually how to fix it)

@almet

This comment has been minimized.

Show comment
Hide comment
@almet

almet Mar 24, 2013

Member

I'm wondering if we want to actually provide a back compat for signals; Maybe the right way is to break things up and to issue a new version of pelican (4.0) with all these changes?

Member

almet commented Mar 24, 2013

I'm wondering if we want to actually provide a back compat for signals; Maybe the right way is to break things up and to issue a new version of pelican (4.0) with all these changes?

@wking

This comment has been minimized.

Show comment
Hide comment
@wking

wking Mar 24, 2013

Contributor

On Sat, Mar 23, 2013 at 06:43:23PM -0700, Alexis Metaireau wrote:

We don't need to figure out how to deal with it seamlessly, but we
need to issue a warning if people are using the old way of handling
this, so they know it's not working anymore (and eventually how to
fix it)

Ah, that I can do. I'll work up a patch for configure_settings().

Contributor

wking commented Mar 24, 2013

On Sat, Mar 23, 2013 at 06:43:23PM -0700, Alexis Metaireau wrote:

We don't need to figure out how to deal with it seamlessly, but we
need to issue a warning if people are using the old way of handling
this, so they know it's not working anymore (and eventually how to
fix it)

Ah, that I can do. I'll work up a patch for configure_settings().

@wking

This comment has been minimized.

Show comment
Hide comment
@wking

wking Mar 24, 2013

Contributor

On Sat, Mar 23, 2013 at 07:09:22PM -0700, Alexis Metaireau wrote:

I'm wondering if we want to actually provide a back compat for
signals;

If folks are using blinker directly, this could be difficult. If they
are importing from pelican.signals, then we should be able to get away
with:

article_generator_preread = signal('article_generator_preread')
article_generate_preread = article_generator_preread # backwards compat

It may be possible to figure out a general object deprecator
(analagous to deprecated_attribute) so we can do:

article_generator_preread = signal('article_generator_preread')
@deprecated(
old='article_generate_preread', new='article_generator_preread',
since=(3, 2), remove=(4, 0),
doc='See http://docs.getpelican.com/en/3.2/plugins.html#list-of-signals')
article_generate_preread = article_generator_preread # backwards compat

Although it may be worth adding a "what's new" page to the docs,
instead of splitting ugrade information up by category.

Maybe the right way is to break things up and to issue a new version
of pelican (4.0) with all these changes?

I'm not a big fan of this sort of parallel development. If you expect
new features in the 3.x branch, porting between 4.x and 3.x (in either
direction) will be difficult. If you don't expect new features in the
3.x branch, it's fine to say something like:

A good deal of Pelican's internal structure changed in the run-up to
4.0. For folks who are unable to immediately transition to the new
settings, we'll continue to apply bug fixes to the 3.x branch for
another $N months. New feature development will be limited to the
4.x series.

There is much less burden from a maintenance perspective if you give
everyone some breathing room and then force them over the hump :p.

Contributor

wking commented Mar 24, 2013

On Sat, Mar 23, 2013 at 07:09:22PM -0700, Alexis Metaireau wrote:

I'm wondering if we want to actually provide a back compat for
signals;

If folks are using blinker directly, this could be difficult. If they
are importing from pelican.signals, then we should be able to get away
with:

article_generator_preread = signal('article_generator_preread')
article_generate_preread = article_generator_preread # backwards compat

It may be possible to figure out a general object deprecator
(analagous to deprecated_attribute) so we can do:

article_generator_preread = signal('article_generator_preread')
@deprecated(
old='article_generate_preread', new='article_generator_preread',
since=(3, 2), remove=(4, 0),
doc='See http://docs.getpelican.com/en/3.2/plugins.html#list-of-signals')
article_generate_preread = article_generator_preread # backwards compat

Although it may be worth adding a "what's new" page to the docs,
instead of splitting ugrade information up by category.

Maybe the right way is to break things up and to issue a new version
of pelican (4.0) with all these changes?

I'm not a big fan of this sort of parallel development. If you expect
new features in the 3.x branch, porting between 4.x and 3.x (in either
direction) will be difficult. If you don't expect new features in the
3.x branch, it's fine to say something like:

A good deal of Pelican's internal structure changed in the run-up to
4.0. For folks who are unable to immediately transition to the new
settings, we'll continue to apply bug fixes to the 3.x branch for
another $N months. New feature development will be limited to the
4.x series.

There is much less burden from a maintenance perspective if you give
everyone some breathing room and then force them over the hump :p.

@wking

This comment has been minimized.

Show comment
Hide comment
@wking

wking Mar 24, 2013

Contributor

On Sun, Mar 24, 2013 at 07:11:36AM -0400, W. Trevor King wrote:

It may be possible to figure out a general object deprecator
(analagous to deprecated_attribute) so we can do:

article_generator_preread = signal('article_generator_preread')
@deprecated(
old='article_generate_preread', new='article_generator_preread',
since=(3, 2), remove=(4, 0),
doc='See http://docs.getpelican.com/en/3.2/plugins.html#list-of-signals')
article_generate_preread = article_generator_preread # backwards compat

It turns out that this is not possible without a lot of contortion.
What we want is something along the lines of:

def deprecated(old, new, new_value, since, remove):
def _fget():
print_warning()
return(new_value)
def decorator(dummy):
return property(fget=_fget)
return property

article_generator_preread = signal('article_generator_preread')
@deprecated(
old='article_generate_preread', new='article_generator_preread',
new_value=article_generator_preread,
since=(3, 2), remove=(4, 0))
def article_generate_preread():
return None

But property() instances only work as class attributes (not as module
attributes). There are ways of getting around this [1,2], but they
aren't pretty.

Contributor

wking commented Mar 24, 2013

On Sun, Mar 24, 2013 at 07:11:36AM -0400, W. Trevor King wrote:

It may be possible to figure out a general object deprecator
(analagous to deprecated_attribute) so we can do:

article_generator_preread = signal('article_generator_preread')
@deprecated(
old='article_generate_preread', new='article_generator_preread',
since=(3, 2), remove=(4, 0),
doc='See http://docs.getpelican.com/en/3.2/plugins.html#list-of-signals')
article_generate_preread = article_generator_preread # backwards compat

It turns out that this is not possible without a lot of contortion.
What we want is something along the lines of:

def deprecated(old, new, new_value, since, remove):
def _fget():
print_warning()
return(new_value)
def decorator(dummy):
return property(fget=_fget)
return property

article_generator_preread = signal('article_generator_preread')
@deprecated(
old='article_generate_preread', new='article_generator_preread',
new_value=article_generator_preread,
since=(3, 2), remove=(4, 0))
def article_generate_preread():
return None

But property() instances only work as class attributes (not as module
attributes). There are ways of getting around this [1,2], but they
aren't pretty.

pelican/settings.py
+ 'DEFAULT_DATE_FORMAT': '%a %d %B %Y',
+ 'DATE_FORMATS': {},
+ 'ASCIIDOC_OPTIONS': [],
+ 'MD_EXTENSIONS': ['codehilite', 'extra'],

This comment has been minimized.

@avaris

avaris Apr 29, 2013

Member

This was changed to MD_EXTENSIONS = ['codehilite(css_class=highlight)', 'extra']

@avaris

avaris Apr 29, 2013

Member

This was changed to MD_EXTENSIONS = ['codehilite(css_class=highlight)', 'extra']

@almet

This comment has been minimized.

Show comment
Hide comment
@almet

almet May 13, 2013

Member

Where are we with this? Should it be reviewed, ready for inclusion?

Member

almet commented May 13, 2013

Where are we with this? Should it be reviewed, ready for inclusion?

@wking

This comment has been minimized.

Show comment
Hide comment
@wking

wking May 13, 2013

Contributor

On Sun, May 12, 2013 at 08:43:14PM -0700, Alexis Metaireau wrote:

Where are we with this? Should it be reviewed, ready for inclusion?

Practically ready for inclusion, but @tbunnyman wants a cleaner
history by rebasing the config-oriented changes into a separate PR
that would land before the read_file-oriented changes. That should
make reviewing each part easier. I won't have time to rebase and test
the results until June, so we're on hold until then.

Contributor

wking commented May 13, 2013

On Sun, May 12, 2013 at 08:43:14PM -0700, Alexis Metaireau wrote:

Where are we with this? Should it be reviewed, ready for inclusion?

Practically ready for inclusion, but @tbunnyman wants a cleaner
history by rebasing the config-oriented changes into a separate PR
that would land before the read_file-oriented changes. That should
make reviewing each part easier. I won't have time to rebase and test
the results until June, so we're on hold until then.

@almet

This comment has been minimized.

Show comment
Hide comment
@almet

almet May 13, 2013

Member

Thanks for the update Trevor.

Member

almet commented May 13, 2013

Thanks for the update Trevor.

@wking

This comment has been minimized.

Show comment
Hide comment
@wking

wking Jun 3, 2013

Contributor

On Sun, May 12, 2013 at 11:48:37PM -0400, W. Trevor King wrote:

@tbunnyman wants a cleaner history by rebasing the config-oriented
changes into a separate PR that would land before the
read_file-oriented changes.

The config changes just landed in #921, so I've rebased this (without
those changes) onto the current master. Ready for review and merging,
as far as I'm concerned (although we'll see what Travis has to say ;).

Contributor

wking commented Jun 3, 2013

On Sun, May 12, 2013 at 11:48:37PM -0400, W. Trevor King wrote:

@tbunnyman wants a cleaner history by rebasing the config-oriented
changes into a separate PR that would land before the
read_file-oriented changes.

The config changes just landed in #921, so I've rebased this (without
those changes) onto the current master. Ready for review and merging,
as far as I'm concerned (although we'll see what Travis has to say ;).

article_generator_init = signal('article_generator_init')
-article_generator_finalized = signal('article_generate_finalized')

This comment has been minimized.

@almet

almet Jun 6, 2013

Member

hmm, so this breaks the APIs. At least the plugins won't run and won't fail silently.
We probably should add a note in the docs about the deprecation.

@almet

almet Jun 6, 2013

Member

hmm, so this breaks the APIs. At least the plugins won't run and won't fail silently.
We probably should add a note in the docs about the deprecation.

This comment has been minimized.

@wking

wking Jun 7, 2013

Contributor

On Thu, Jun 06, 2013 at 03:53:48PM -0700, Alexis Metaireau wrote:

We probably should add a note in the docs about the deprecation.

Will do.

@wking

wking Jun 7, 2013

Contributor

On Thu, Jun 06, 2013 at 03:53:48PM -0700, Alexis Metaireau wrote:

We probably should add a note in the docs about the deprecation.

Will do.

@justinmayer

This comment has been minimized.

Show comment
Hide comment
@justinmayer

justinmayer Jun 6, 2013

Member

Interestingly enough, even though Travis is reporting that tests pass, I'm getting test failures on my local development environment (both on Python 2.7.2 and 3.3.2). Steps I took:

mkvirtualenv pelican-tmp
git clone https://github.com/wking/pelican
cd pelican
git checkout read-file
pip install -r dev_requirements.txt
python setup.py develop
python -m unittest discover

Here's the error output.

Member

justinmayer commented Jun 6, 2013

Interestingly enough, even though Travis is reporting that tests pass, I'm getting test failures on my local development environment (both on Python 2.7.2 and 3.3.2). Steps I took:

mkvirtualenv pelican-tmp
git clone https://github.com/wking/pelican
cd pelican
git checkout read-file
pip install -r dev_requirements.txt
python setup.py develop
python -m unittest discover

Here's the error output.

pelican/settings.py
@@ -257,6 +257,8 @@ def configure_settings(settings):
for old,new,doc in [
('LESS_GENERATOR', 'the Webassets plugin', None),
+ ('FILES_TO_COPY', 'STATIC_PATHS and EXTRA_PATH_METADATA',
+ 'https://github.com/getpelican/pelican/pull/795'),

This comment has been minimized.

@almet

almet Jun 6, 2013

Member

do you really plan to point users here?

@almet

almet Jun 6, 2013

Member

do you really plan to point users here?

This comment has been minimized.

@wking

wking Jun 7, 2013

Contributor

On Thu, Jun 06, 2013 at 04:18:09PM -0700, Alexis Metaireau wrote:

@@ -257,6 +257,8 @@ def configure_settings(settings):

 for old,new,doc in [
         ('LESS_GENERATOR', 'the Webassets plugin', None),
  •        ('FILES_TO_COPY', 'STATIC_PATHS and EXTRA_PATH_METADATA',
    
  •         'https://github.com/getpelican/pelican/pull/795'),
    

do you really plan to point users here?

The history of this PR was less confusing when I wrote that ;). I'll
write up some docs.

@wking

wking Jun 7, 2013

Contributor

On Thu, Jun 06, 2013 at 04:18:09PM -0700, Alexis Metaireau wrote:

@@ -257,6 +257,8 @@ def configure_settings(settings):

 for old,new,doc in [
         ('LESS_GENERATOR', 'the Webassets plugin', None),
  •        ('FILES_TO_COPY', 'STATIC_PATHS and EXTRA_PATH_METADATA',
    
  •         'https://github.com/getpelican/pelican/pull/795'),
    

do you really plan to point users here?

The history of this PR was less confusing when I wrote that ;). I'll
write up some docs.

- reader = EXTENSIONS[fmt](settings)
+ reader_class = EXTENSIONS[fmt]
+ if not reader_class.enabled:
+ raise ValueError('Missing dependencies for {}'.format(fmt))

This comment has been minimized.

@almet

almet Jun 6, 2013

Member

<3

+ full_path=path, source_path=source_path, settings=settings))
+ metadata.update(parse_path_metadata(
+ source_path=source_path, settings=settings,
+ process=reader.process_metadata))

This comment has been minimized.

@almet

almet Jun 6, 2013

Member

should we rename "process" to "processor"?

@almet

almet Jun 6, 2013

Member

should we rename "process" to "processor"?

This comment has been minimized.

@wking

wking Jun 7, 2013

Contributor

On Thu, Jun 06, 2013 at 04:23:18PM -0700, Alexis Metaireau wrote:

should we rename "process" to "processor"?

I like verby functions, but I don't really mind.

@wking

wking Jun 7, 2013

Contributor

On Thu, Jun 06, 2013 at 04:23:18PM -0700, Alexis Metaireau wrote:

should we rename "process" to "processor"?

I like verby functions, but I don't really mind.

@almet

This comment has been minimized.

Show comment
Hide comment
@almet

almet Jun 6, 2013

Member

I love these changes. The code is way cleaner with them and easier to understand.

I'm a bit concerned with the fact it breaks some APIs. the signals are "okay", but we should at least have a deprecation warning for the users before removing the olds ones.

That's a different story for the settings we are changing. I'm not reluctant to change them but I'm just thinking we should be extra careful and tell people at runtime that these changed. You can use the handle_deprecation method for this. See https://github.com/getpelican/pelican/blob/master/pelican/__init__.py#L82

Otherwise, a big +1 for all this.

Member

almet commented Jun 6, 2013

I love these changes. The code is way cleaner with them and easier to understand.

I'm a bit concerned with the fact it breaks some APIs. the signals are "okay", but we should at least have a deprecation warning for the users before removing the olds ones.

That's a different story for the settings we are changing. I'm not reluctant to change them but I'm just thinking we should be extra careful and tell people at runtime that these changed. You can use the handle_deprecation method for this. See https://github.com/getpelican/pelican/blob/master/pelican/__init__.py#L82

Otherwise, a big +1 for all this.

@wking

This comment has been minimized.

Show comment
Hide comment
@wking

wking Jun 7, 2013

Contributor

On Thu, Jun 06, 2013 at 04:09:53PM -0700, Justin Mayer wrote:

Interestingly enough, even though Travis is reporting that tests
pass, I'm getting test failures on my local development environment…

Hmm, looks like part of the path is being duplicated:
pelican/tests/content/pelican/tests/content. I'll try and track
this down tomorrow.

Contributor

wking commented Jun 7, 2013

On Thu, Jun 06, 2013 at 04:09:53PM -0700, Justin Mayer wrote:

Interestingly enough, even though Travis is reporting that tests
pass, I'm getting test failures on my local development environment…

Hmm, looks like part of the path is being duplicated:
pelican/tests/content/pelican/tests/content. I'll try and track
this down tomorrow.

docs/settings.rst
+=============
+
+Not all metadata needs to be `embedded in source file itself`__. For
+exampe, blog posts are often named following a ``YYYY-MM-DD-SLUG.rst``

This comment has been minimized.

@saimn

saimn Jun 8, 2013

Contributor

s/exampe/example/

@saimn

saimn Jun 8, 2013

Contributor

s/exampe/example/

pelican/tests/test_readers.py
@@ -20,13 +20,13 @@ class ReaderTest(unittest.TestCase):
def read_file(self, path, **kwargs):
# Isolate from future API changes to readers.read_file
return readers.read_file(
- _path(path), settings=get_settings(**kwargs))
+ base_path=CONTENT_PATH, path=_path(path), settings=get_settings(**kwargs))

This comment has been minimized.

@saimn

saimn Jun 8, 2013

Contributor

this will join CONTENT_PATH twice and gives path like '/home/simon/Dev/pelican/pelican/tests/content/pelican/tests/content/article_with_metadata.rst' if CONTENT_PATH is a relative path (which is the case with unitests but not with py.test (?))

@saimn

saimn Jun 8, 2013

Contributor

this will join CONTENT_PATH twice and gives path like '/home/simon/Dev/pelican/pelican/tests/content/pelican/tests/content/article_with_metadata.rst' if CONTENT_PATH is a relative path (which is the case with unitests but not with py.test (?))

This comment has been minimized.

@wking

wking Jun 9, 2013

Contributor

On Sat, Jun 08, 2013 at 07:28:56AM -0700, Simon Conseil wrote:

this will join CONTENT_PATH twice

Ahh, that's where justmay's errors were coming from…

@wking

wking Jun 9, 2013

Contributor

On Sat, Jun 08, 2013 at 07:28:56AM -0700, Simon Conseil wrote:

this will join CONTENT_PATH twice

Ahh, that's where justmay's errors were coming from…

+ preread_signal=None, preread_sender=None,
+ context_signal=None, context_sender=None):
+ """Return a content object parsed with the given format."""
+ path = os.path.abspath(os.path.join(base_path, path))

This comment has been minimized.

@saimn

saimn Jun 8, 2013

Contributor

why is base_path needed ? It seems that path is always a full path so base_path should be equal to os.path.dirname(path).

> /home/simon/Dev/pelican/pelican/readers.py(347)read_file()
    346 
--> 347     path = os.path.abspath(os.path.join(base_path, path))
    348     source_path = os.path.relpath(path, base_path)

ipdb> base_path
u'/home/simon/Dev/pelican/samples/content'
ipdb> path
u'/home/simon/Dev/pelican/samples/content/article2-fr.rst'
@saimn

saimn Jun 8, 2013

Contributor

why is base_path needed ? It seems that path is always a full path so base_path should be equal to os.path.dirname(path).

> /home/simon/Dev/pelican/pelican/readers.py(347)read_file()
    346 
--> 347     path = os.path.abspath(os.path.join(base_path, path))
    348     source_path = os.path.relpath(path, base_path)

ipdb> base_path
u'/home/simon/Dev/pelican/samples/content'
ipdb> path
u'/home/simon/Dev/pelican/samples/content/article2-fr.rst'

wking added some commits Jan 4, 2013

readers: Instrument read_file to return Content objects
The assorted generators all use read_file() to read in the file
contents and metadata.  Previously, they sometimes parse additional
metadata, fire off signals, and initialize a pelican.contents.Content
subclass on their own.  We can reduce duplicated code and increase
consistency by shifting all that stuff into read_file() itself, and
this commit is a step in that direction.
Move Article metadata extraction from generators to readers
There's no reason why this information should be Article-specific.

This commit breaks the other generators for the moment.  I'll fix them
shortly.
signals: Add missing signals
Note that the `pages_*` names are plural , while the `article_*` names
are singular.  I'll fix this once I update the PagesGenerator.
signals: Fix *_generate_* signals -> *_generator_*
For example, article_generate_preread is now article_generator_preread
for consistency with the other preread and context signals.

wking added some commits Jan 4, 2013

generators: Update PagesGenerator to use new read_file
Also standardize signal names.  If `article_generator_*` is singular,
`page_generator_*` should be as well.  Fix it from the older
`pages_generator_*`.
Remove the FILES_TO_COPY setting
We no longer instantiate the Static object in the StaticGenerator, so
we can't set the save_as argument anymore.  If you want to adjust the
output path, use the upcoming EXTRA_PATH_METADATA setting.
Add the EXTRA_PATH_METADATA setting
Useful for altering static output paths when you don't want to encode
extra metadata in the static file's source path.
test_generators: Replace CUR_DIR with CONTENT_DIR for subdir cat. det…
…ection

In situations where I've cleared ARTICLE_DIR, I've done so to ensure
that there are no directories that will override the DEFAULT_CATEGORY
due to USE_FOLDER_AS_CATEGORY.
readers: Ensure the reader class is enabled before instantiating
Otherwise the MarkdownReader fails with:

  'bool' object is not callable

if Markdown is not installed.

Reported-by: Deniz Turgut <dturgut@gmail.com>
generators: get_files() should use paths relative to Generator.path
All paths should be relative to Generator.path unless we're actively
accessing the filesystem.  This makes the argument less ambiguous, so
we have less likelyhood of joining paths multiple times.
@justinmayer

This comment has been minimized.

Show comment
Hide comment
@justinmayer

justinmayer Jun 14, 2013

Member

I finally got a chance to do some testing. When I generated a site via 9b42b2a, I got the following warning message:

WARNING: The {} setting has been removed in favor of {}, see {} for details

... which isn't surprising since I still had FILES_TO_COPY in my settings file. But for some reason, the substitution that's supposed to be occurring here, letting me know that I have to use STATIC_PATHS and EXTRA_PATH_METADATA instead, doesn't seem to be happening — thus the empty curly brackets.

Plus, we'll probably want to link to the docs instead of this issue. (See: relevant commit)

Once I added what I presumed to be the requisite replacement settings...

STATIC_PATHS = [
    'extra',
    ]

EXTRA_PATH_METADATA = {
    'extra/apple-touch-icon-57x57-precomposed.png': {'path': 'apple-touch-icon-57x57-precomposed.png'},
    'extra/apple-touch-icon-72x72-precomposed.png': {'path': 'apple-touch-icon-72x72-precomposed.png'},
    'extra/apple-touch-icon-114x114-precomposed.png': {'path': 'apple-touch-icon-114x114-precomposed.png'},
    'extra/apple-touch-icon-precomposed.png': {'path': 'apple-touch-icon-precomposed.png'},
    'extra/apple-touch-icon.png': {'path': 'apple-touch-icon.png'},
    'extra/crossdomain.xml': {'path': 'crossdomain.xml'},
    'extra/favicon.ico': {'path': 'favicon.ico'},
    'extra/humans.txt': {'path': 'humans.txt'},
    'extra/robots.txt': {'path': 'robots.txt'},
    }

... everything behaves precisely as expected. Well done, @wking!

In short, this looks good to me, assuming we address the warning mentioned above.

Member

justinmayer commented Jun 14, 2013

I finally got a chance to do some testing. When I generated a site via 9b42b2a, I got the following warning message:

WARNING: The {} setting has been removed in favor of {}, see {} for details

... which isn't surprising since I still had FILES_TO_COPY in my settings file. But for some reason, the substitution that's supposed to be occurring here, letting me know that I have to use STATIC_PATHS and EXTRA_PATH_METADATA instead, doesn't seem to be happening — thus the empty curly brackets.

Plus, we'll probably want to link to the docs instead of this issue. (See: relevant commit)

Once I added what I presumed to be the requisite replacement settings...

STATIC_PATHS = [
    'extra',
    ]

EXTRA_PATH_METADATA = {
    'extra/apple-touch-icon-57x57-precomposed.png': {'path': 'apple-touch-icon-57x57-precomposed.png'},
    'extra/apple-touch-icon-72x72-precomposed.png': {'path': 'apple-touch-icon-72x72-precomposed.png'},
    'extra/apple-touch-icon-114x114-precomposed.png': {'path': 'apple-touch-icon-114x114-precomposed.png'},
    'extra/apple-touch-icon-precomposed.png': {'path': 'apple-touch-icon-precomposed.png'},
    'extra/apple-touch-icon.png': {'path': 'apple-touch-icon.png'},
    'extra/crossdomain.xml': {'path': 'crossdomain.xml'},
    'extra/favicon.ico': {'path': 'favicon.ico'},
    'extra/humans.txt': {'path': 'humans.txt'},
    'extra/robots.txt': {'path': 'robots.txt'},
    }

... everything behaves precisely as expected. Well done, @wking!

In short, this looks good to me, assuming we address the warning mentioned above.

settings: Fix deprecation warning in configure_settings()
The broken code came from my 1d4d86c (settings: Rework the
LESS_GENERATOR removal warning for easy extension, 2013-03-24), where
I put formatting placeholders ({}) into the warning message, but
forgot to fill them in :/.
@wking

This comment has been minimized.

Show comment
Hide comment
@wking

wking Jun 14, 2013

Contributor

On Thu, Jun 13, 2013 at 06:08:44PM -0700, Justin Mayer wrote:

WARNING: The {} setting has been removed in favor of {}, see {} for details

Oops, that snuck in with #921. I've added 180cf91 to the end of
this PR to fix it.

Plus, we'll probably want to link to the docs instead of this issue.

I'd already fixed that with ce0a9b6, a new version of the commit you'd
linked to.

In short, this looks good to me, assuming we address the warning
mentioned above.

Should be good to go now.

Contributor

wking commented Jun 14, 2013

On Thu, Jun 13, 2013 at 06:08:44PM -0700, Justin Mayer wrote:

WARNING: The {} setting has been removed in favor of {}, see {} for details

Oops, that snuck in with #921. I've added 180cf91 to the end of
this PR to fix it.

Plus, we'll probably want to link to the docs instead of this issue.

I'd already fixed that with ce0a9b6, a new version of the commit you'd
linked to.

In short, this looks good to me, assuming we address the warning
mentioned above.

Should be good to go now.

@justinmayer

This comment has been minimized.

Show comment
Hide comment
@justinmayer

justinmayer Jun 14, 2013

Member

Tested again, and the warning is indeed now displayed properly. Awesome!

One last thing... Alexis voiced some concern about this PR breaking some APIs. I don't recall if that's been addressed/handled. Could you remind me?

Member

justinmayer commented Jun 14, 2013

Tested again, and the warning is indeed now displayed properly. Awesome!

One last thing... Alexis voiced some concern about this PR breaking some APIs. I don't recall if that's been addressed/handled. Could you remind me?

@justinmayer

This comment has been minimized.

Show comment
Hide comment
@justinmayer

justinmayer Jun 14, 2013

Member

In reference to my question above, in IRC @wking referred me to the docs included in this PR (8797f0e), which I clearly missed.

I think this is good to go. Any further comments before it's merged?

Member

justinmayer commented Jun 14, 2013

In reference to my question above, in IRC @wking referred me to the docs included in this PR (8797f0e), which I clearly missed.

I think this is good to go. Any further comments before it's merged?

@justinmayer

This comment has been minimized.

Show comment
Hide comment
@justinmayer

justinmayer Jun 15, 2013

Member

Many thanks to @wking for all the hard work on this effort. Much appreciated! Merged.

Member

justinmayer commented Jun 15, 2013

Many thanks to @wking for all the hard work on this effort. Much appreciated! Merged.

justinmayer added a commit that referenced this pull request Jun 15, 2013

Merge pull request #795 from wking/read-file
Generate context objects in read_file()

@justinmayer justinmayer merged commit 5d9b3d7 into getpelican:master Jun 15, 2013

1 check passed

default The Travis CI build passed
Details
@almet

This comment has been minimized.

Show comment
Hide comment
@almet

almet Jun 20, 2013

Member

yeah, this breaks the current API, can we please avoid doing that? :-)

Member

almet commented Jun 20, 2013

yeah, this breaks the current API, can we please avoid doing that? :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment