Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve blogger directory structure #157

Merged
merged 10 commits into from Jul 25, 2016
Merged

Conversation

@ChillarAnand
Copy link
Contributor

ChillarAnand commented Jul 24, 2016

No description provided.

self.write_content(
os.path.join(self.output_folder, out_folder, slug + '.html'),
content)
regex = re.compile('/(?P<year>\d{4})/(?P<month>\d{2})/')

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Jul 24, 2016

Member

Are there any other URL structures possible with Blogger? This looks like a hack nevertheless. Also, the slug is very likely to still be incorrect (YYYYMMrest-of-slug)

This comment has been minimized.

Copy link
@ChillarAnand

ChillarAnand Jul 24, 2016

Author Contributor

This is default structure. Not sure if any other structures are possible.

regex = re.compile('/(?P<year>\d{4})/(?P<month>\d{2})/(?P<slug>.*)')
match = regex.search(link_path)
if match:
year, month, slug = match.group('year'), match.group('month'), match.group('slug')

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Jul 24, 2016

Member

You don’t need to name your groups in this case and you could just use match.groups()

Please check if there are other structures available — and perhaps refrain from using regular expressions, instead basing yourself on splitting on /?

This comment has been minimized.

Copy link
@ChillarAnand

ChillarAnand Jul 24, 2016

Author Contributor

Using split seems much simpler. Blogger allows custom slugs but url structure is fixed.

os.path.join(self.output_folder, out_folder, slug + '.html'),
content)
link_strings = link_path.split('/')
if len(link_strings) == 3:

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Jul 24, 2016

Member

This is cleaner:

link_fragments = link_path.split('/')
slug = link_fragments[-1]
out_path = os.path.join(self.output_folder, out_folder, link_path)

This comment has been minimized.

Copy link
@ChillarAnand

ChillarAnand Jul 24, 2016

Author Contributor

yes please

@ChillarAnand ChillarAnand force-pushed the ChillarAnand:master branch 2 times, most recently from f144d91 to 4251d62 Jul 24, 2016
os.path.join(self.output_folder, out_folder, slug + '.html'),
content)
link_fragments = link_path.split('/')
if len(link_fragments) == 4:

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Jul 24, 2016

Member
  • this check is unnecessary; however, you should have slug pre-determined and set to the correct value much earlier (line 161)
  • remove the unnecessary re import

This comment has been minimized.

Copy link
@ChillarAnand

ChillarAnand Jul 24, 2016

Author Contributor

What about drafts?

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Jul 24, 2016

Member

Detecting drafts does not depend on this as far as I can see. You need to write the correct path to the URL map, for example.

@ChillarAnand ChillarAnand force-pushed the ChillarAnand:master branch from 4251d62 to f37a5b1 Jul 24, 2016
@ChillarAnand ChillarAnand force-pushed the ChillarAnand:master branch from f37a5b1 to 6892f22 Jul 24, 2016
self.write_content(
os.path.join(self.output_folder, out_folder, slug + '.html'),
content)
if is_draft:

This comment has been minimized.

Copy link
@Kwpolska
@@ -154,8 +156,11 @@ def import_item(self, item, out_folder=None):

if link_path.lower().endswith('.html'):
link_path = link_path[:-5]
link_path = link_path.lstrip('/')

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Jul 24, 2016

Member

Do this unconditionally.

@Kwpolska
Copy link
Member

Kwpolska commented Jul 24, 2016

LGTM, but I’d like a second opinion just in case (we went through so many changes here.) @punchagan , @ralsina?

(thanks for listening to all my complaints. In the future, it might be better not to amend and force-push)

@@ -182,21 +187,19 @@ def import_item(self, item, out_folder=None):
else:
is_draft = False

self.url_map[link] = self.context['SITE_URL'] + '/' + \
out_folder + '/' + slug + '.html'
self.url_map[link] = os.path.join(

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Jul 24, 2016

Member

You can’t use os.path.join for URLs (that would give backslashes on Windows), just use slashes manually or do os.path.join().replace(os.sep, '/')

out_folder + '/' + slug + '.html'

self.url_map[link] = self.context['SITE_URL'] + out_folder + \
'/' + link_path + '.html'

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Jul 24, 2016

Member

better style (flake8 complains about indent):

        self.url_map[link] = (self.context['SITE_URL'] + out_folder +
                              '/' + link_path + '.html')
self.url_map[link] = self.context['SITE_URL'] + '/' + \
out_folder + '/' + slug + '.html'

self.url_map[link] = (self.context['SITE_URL'] + out_folder +

This comment has been minimized.

Copy link
@punchagan

punchagan Jul 25, 2016

Member

It might be safer to ensure that SITE_URL ends with a '/' when populating the context. Some blogs probably don't return a url ending in '/' and that may be why we had it in our code.

This comment has been minimized.

Copy link
@ChillarAnand

ChillarAnand Jul 25, 2016

Author Contributor

👍

slug = utils.slugify(link_path)
out_path = os.path.join(self.output_folder, out_folder, link_path)
link_fragments = link_path.split('/')
slug = utils.slugify(link_fragments[-1])

This comment has been minimized.

Copy link
@punchagan

punchagan Jul 25, 2016

Member

Are blogger slugs unique? For example, Could there be posts on different dates /2016/06/auto-completion-for-custom-search and /2014/03/auto-completion-for-custom-search, which would cause slug clashes?

This comment has been minimized.

Copy link
@ChillarAnand

ChillarAnand Jul 25, 2016

Author Contributor

Wont they go to different folders?

This comment has been minimized.

Copy link
@punchagan

punchagan Jul 25, 2016

Member

Right. I was thinking Nikola needed to have unique slugs, but that's not the case. This should be fine.

@Kwpolska
Copy link
Member

Kwpolska commented Jul 25, 2016

Merging, thanks for your contribution!

@Kwpolska Kwpolska merged commit ee13131 into getnikola:master Jul 25, 2016
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.