Add CSV source #1
Closed
hvelarde force-pushed the hvelarde-cvssource branch 7 times, most recently from 4c7586f to 3c151c9 (July 29, 2015 18:50)
hvelarde force-pushed the hvelarde-cvssource branch 3 times, most recently from eb66a9d to ea411a8 (July 30, 2015 19:21)
… the WXR dump, since the latter contains unprocessed WordPress-specific markup
* add workaround for nasty lxml bug
* workaround for last-item lxml bug (fixed in lxml 3.2.2)
* added new 'import-comment' boolean setting. If false (the default), no WordPress comments are imported
…rialize WordPress metadata about images, attachments and other useful stuff
…ss it on for later use
* extract information about 'Image' from post metadata tags (this seems to correspond roughly to the 'lead image' for a post).
* extract information about WordPress attachments from wp:attachment_url tags and the associated post metadata tags.
* extract information about Disqus comment threads for posts, possibly useful if you want to re-associate Disqus threads later.

Add a new pipeline section which downloads 'enclosures' from the WordPress site and loads them into Plone as files. Possible incremental improvements: associate enclosures with posts later via 'related items' or some similar mechanism, and skip downloading any enclosure whose URL is either not in a whitelist of good URLs or is in a blacklist of bad URLs. Should the whitelist and blacklist be set by configuration?
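The whitelist/blacklist idea for enclosure downloads could be sketched roughly as follows. This is only an illustration, not code from this pull request: the function name `should_download`, its parameters, and the choice to match on the URL's host name (rather than on full URLs or prefixes) are all assumptions. It is written for Python 3.

```python
from urllib.parse import urlparse


def should_download(url, whitelist=(), blacklist=()):
    """Decide whether an enclosure URL should be downloaded.

    Hypothetical helper: hosts are compared case-insensitively.
    A blacklisted host is always rejected. If a whitelist is given,
    only hosts that appear in it are accepted.
    """
    host = (urlparse(url).hostname or "").lower()
    if host in blacklist:
        return False
    if whitelist and host not in whitelist:
        return False
    return True
```

With an empty whitelist every host that is not blacklisted is accepted, so the filter stays inactive until it is configured.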
For now, only run code analysis.
Import path now uses the same structure as the WordPress site:
- posts and pages are imported according to the permalink_structure
- attachments are imported into the wp-content/uploads folder

Also the following changes were made to the code:
- Update documentation
- Split code into more modules
- Get categories and tags
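As a rough illustration of deriving an import path from a WordPress permalink_structure, a minimal sketch might look like this. The function `permalink_path` and its arguments are hypothetical and not taken from this pull request; only a handful of the standard WordPress structure tags (`%year%`, `%monthnum%`, `%day%`, `%postname%`, `%post_id%`) are handled, and the sketch targets Python 3.

```python
from datetime import datetime


def permalink_path(structure, post_date, post_name, post_id):
    """Expand a WordPress permalink structure into a relative path.

    Hypothetical helper: unsupported tags are left in place untouched.
    """
    values = {
        "%year%": f"{post_date:%Y}",
        "%monthnum%": f"{post_date:%m}",
        "%day%": f"{post_date:%d}",
        "%postname%": post_name,
        "%post_id%": str(post_id),
    }
    path = structure
    for tag, value in values.items():
        path = path.replace(tag, value)
    # Drop leading and trailing slashes so the result is a relative path.
    return path.strip("/")
```

For example, `permalink_path("/%year%/%monthnum%/%postname%/", datetime(2015, 7, 29), "hello-world", 42)` gives `"2015/07/hello-world"`.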
…ere already imported
rodfersou force-pushed the hvelarde-cvssource branch from 9980dce to 1087608 (July 31, 2015 18:02)
No description provided.