Copying HTML files to output without processing them #1046

Closed
liyanage opened this issue Aug 15, 2013 · 18 comments

Comments

@liyanage

The documentation for FILES_TO_COPY says directories are allowed:

A list of files (or directories) to copy

But when I try it, it errors out at that point:

CRITICAL: (21, 'Is a directory')
CRITICAL: [Errno 21] Is a directory: u'/Users/liyanage/Documents/websites/foo.com-pelican/content/software/bar/'
@justinmayer
Member

Hi Marc! Very sorry for the delay in responding. Your question has been somewhat obsoleted by the release of Pelican 3.3, which removed FILES_TO_COPY and instead relies on STATIC_PATHS and EXTRA_PATH_METADATA. See the Path Metadata section of the docs for more details.
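For readers following along, a minimal sketch of what the replacement settings might look like in `pelicanconf.py`. The directory `software/bar` is taken from the error message above; `extra/robots.txt` is a hypothetical file used only to illustrate `EXTRA_PATH_METADATA`, and paths are assumed to be relative to the content directory.

```python
# pelicanconf.py (sketch, not a complete settings file)

# Paths (relative to the content directory) that Pelican should treat as static.
STATIC_PATHS = [
    "images",            # Pelican's default static directory
    "software/bar",      # directory from the error message above
    "extra/robots.txt",  # hypothetical file, for illustration only
]

# Per-path metadata; here it places the hypothetical file at the site root.
EXTRA_PATH_METADATA = {
    "extra/robots.txt": {"path": "robots.txt"},
}
```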

As an aside, I've been using your Mac OS X tools for many, many years, so it's a pleasure to see your interest in Pelican. If there's anything further we can do to assist, please don't hesitate to ask! 😺

@liyanage
Author

Heh, small world, glad to hear my OS X packages were useful to you.

Good to hear the new version has a new feature to handle this, I will try that. Many thanks!

@liyanage
Author

I just tried the STATIC_PATHS option, and while this does indeed no longer error out with directories, it seems to attempt to process the .html files in the directory, which I don't want:

[localhost] local: pelican content -s pelicanconf.py
WARNING: Could not process software/xxx/genindex.html
'NoneType' object has no attribute 'lower'
WARNING: Could not process software/xxx/index.html
'NoneType' object has no attribute 'lower'

@liyanage liyanage reopened this Nov 11, 2013
@justinmayer
Member

Pelican supports HTML files as source input, along with Markdown, reST, and Asciidoc formats. I believe that particular error should be converted to a warning as of 35375b1, which was just merged and is not yet in a shipped release.

But the larger question remains... Why do you have .html files in your source content?

@liyanage
Author

In my content folder I have a directory with already fully pre-generated HTML, generated outside Pelican by Sphinx. I would like that entire directory to be copied over to the output directory without any changes, so I listed it in the STATIC_PATHS array.

Am I using it wrong?

@justinmayer
Member

As @wilbur-ma mentioned in #1157, try adding the following to your settings file:

READERS = {"html": None}

@liyanage
Author

That works, thanks!

Perhaps as a feature request, it seems that this setting is global. It might be nice to be able to scope it to parts of the input tree, in case I ever want to have Pelican preprocess .html files in other parts of the source tree.

Feel free to close this if you think such a feature would complicate matters too much.

@justinmayer
Member

I can see how some folks might have a need for that. How do you think that might be implemented? And would that be something you would want to work on?

@liyanage
Author

Yes, if I find some time and/or need this ability in the project I'm about to set up with Pelican, I'll try to add it.

Perhaps by extending the system so that the READERS setting also accepts an array or dictionary that maps source path match patterns to reader dictionaries. This would be in addition to what it accepts today, of course.
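To make the idea concrete, here is a rough sketch of how path-scoped reader overrides could be resolved. None of this exists in Pelican: `SCOPED_READERS`, `resolve_reader`, and the placeholder reader names are all hypothetical, and glob matching via `fnmatchcase` is just one possible choice of pattern syntax.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical setting (not part of Pelican): maps a source-path glob to a
# READERS-style dict of file extension -> reader (None disables processing).
SCOPED_READERS = {
    "software/*": {"html": None},
}

# Placeholder strings stand in for real reader classes in this sketch.
DEFAULT_READERS = {"html": "HTMLReader", "md": "MarkdownReader"}


def resolve_reader(source_path, default_readers, scoped_readers):
    """Return the reader for a content-relative path.

    The first matching pattern that mentions the file's extension wins;
    otherwise the global default for that extension is used.
    """
    ext = PurePosixPath(source_path).suffix.lstrip(".")
    for pattern, overrides in scoped_readers.items():
        # Note: "*" in fnmatch patterns also matches "/" characters.
        if fnmatchcase(source_path, pattern) and ext in overrides:
            return overrides[ext]
    return default_readers.get(ext)
```

With these values, `.html` files under `software/` resolve to `None` (copy only), while `.html` files elsewhere still resolve to the default reader.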

@foresto
Contributor

foresto commented Oct 13, 2014

Would setting PAGE_EXCLUDES=['software/xxx'] and/or ARTICLE_EXCLUDES=['software/xxx'] solve this problem? (Along with STATIC_PATHS, of course.)
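A minimal sketch of that combination in `pelicanconf.py`, assuming `software/xxx` is the pre-generated directory relative to the content folder (the name is the placeholder used earlier in this thread):

```python
# pelicanconf.py (sketch)

# Copy the pre-generated directory to the output as static files.
STATIC_PATHS = ["images", "software/xxx"]

# Keep the page and article generators from trying to parse anything in it.
PAGE_EXCLUDES = ["software/xxx"]
ARTICLE_EXCLUDES = ["software/xxx"]
```

Unlike `READERS = {"html": None}`, this leaves `.html` processing enabled for the rest of the content tree.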

@MartinThoma

I think I might have the same problem: http://stackoverflow.com/q/37104625/562769 - can anybody solve it?

@Scheirle
Member

Scheirle commented May 9, 2016

@MartinThoma move your html5 folder into your content folder (a symlink works too) and only use STATIC_PATHS = [..., 'html5']

@MartinThoma

@Scheirle I tried it and I got

ERROR: Skipping html5/regression/regression.htm: could not find information about 'NameError: date'
ERROR: Skipping html5/regression/README.md: could not find information about 'NameError: title'

@Scheirle
Member

Scheirle commented May 9, 2016

@MartinThoma use ARTICLE_EXCLUDES and maybe PAGE_EXCLUDES to exclude the html5 folder.

Alternatively, move all your articles into a subfolder and use ARTICLE_PATHS, so Pelican only looks in the given folder for articles.
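A sketch of the `ARTICLE_PATHS` alternative, assuming the articles have been moved into a subfolder named `articles` inside the content directory (that folder name is a made-up example):

```python
# pelicanconf.py (sketch)

# Only look for articles inside content/articles ("articles" is hypothetical).
ARTICLE_PATHS = ["articles"]

# Copy content/html5 to the output untouched.
STATIC_PATHS = ["images", "html5"]
```

Because `html5` is outside the article path, no exclude entry should be needed for articles in this layout; pages are governed separately by `PAGE_PATHS`.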

@leotrs

leotrs commented Aug 14, 2016

This is solved by setting both STATIC_PATHS and PAGE/ARTICLE_EXCLUDES at the same time.

I admit it's annoying, but at least this is no longer an issue.

@justinmayer
Member

I'm sure this use case could be handled better, but there are at least avenues to achieve the desired result. Follow-up enhancements are of course welcome.

@mattgilbertnet

Just ran into this issue. The fix in this thread worked, but I would think any static paths would be excluded (from pages and articles processing) by default. Being static and all.
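Until something like that is the default, one way to approximate it is to derive the exclude lists from `STATIC_PATHS` in the settings file itself. This is only a sketch: it assumes every entry in `STATIC_PATHS` is a directory that should never contain pages or articles, and the `html5` name is borrowed from the example earlier in the thread.

```python
# pelicanconf.py (sketch)

STATIC_PATHS = ["images", "html5"]

# Reuse the static paths as excludes so the lists cannot drift apart.
# list(...) makes independent copies rather than aliasing one list.
ARTICLE_EXCLUDES = list(STATIC_PATHS)
PAGE_EXCLUDES = list(STATIC_PATHS)
```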

@avaris
Member

avaris commented Mar 15, 2018

8 participants