New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up the Drupal importers #235

Merged
merged 10 commits into from Jul 19, 2016

Conversation

Projects
None yet
3 participants
@borfast
Contributor

borfast commented Dec 24, 2015

Big refactoring of the Drupal importers.

All the code that was repeated on both Drupal6 and Drupal7 importers was pulled up into a generic DrupalAbstract class which implements the "core" logic of the Drupal importers and delegates the version-specific code to sub-classes.

I would appreciate it if someone could test importing non-Drupal content, though, because I don't have another blog to test the non-Drupal importers with.

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 26, 2015

Member

Great start, @borfast! I would love to see this as a module instead of a superclass, or have the one class handle both options and allowing folks to specify the version in the options, maybe. What do you think about that? That gets around the sub-subclassing issue and keeps the inheritance tree very clean.

Member

parkr commented Dec 26, 2015

Great start, @borfast! I would love to see this as a module instead of a superclass, or have the one class handle both options and allowing folks to specify the version in the options, maybe. What do you think about that? That gets around the sub-subclassing issue and keeps the inheritance tree very clean.

@parkr parkr added the enhancement label Dec 26, 2015

@borfast

This comment has been minimized.

Show comment
Hide comment
@borfast

borfast Dec 26, 2015

Contributor

Thanks @parkr!

I actually thought about fixing the sub-subclassing issue by using modules instead but it turned out to cause more issues and since it required less code changes, I ended up going down the other route.

I do agree it's not pretty, though.

Having everything in one single class may not be the best option because it will get messy soon - there are already two Drupal importers and another one may soon pop up for Drupal 8. Having all these things in a single class sounds a bit too cramped.

But having a single Drupal class deriving from Importer, allowing the version to be specified which in turn dictates which specific module will be used sounds like a good idea, a much cleaner approach.

If you agree, I can take a stab at making it work that way.

Contributor

borfast commented Dec 26, 2015

Thanks @parkr!

I actually thought about fixing the sub-subclassing issue by using modules instead but it turned out to cause more issues and since it required less code changes, I ended up going down the other route.

I do agree it's not pretty, though.

Having everything in one single class may not be the best option because it will get messy soon - there are already two Drupal importers and another one may soon pop up for Drupal 8. Having all these things in a single class sounds a bit too cramped.

But having a single Drupal class deriving from Importer, allowing the version to be specified which in turn dictates which specific module will be used sounds like a good idea, a much cleaner approach.

If you agree, I can take a stab at making it work that way.

@borfast

This comment has been minimized.

Show comment
Hide comment
@borfast

borfast Apr 20, 2016

Contributor

Hey @parkr, I finally got round to making the change into a module instead of a superclass.

How does that look to you?

Contributor

borfast commented Apr 20, 2016

Hey @parkr, I finally got round to making the change into a module instead of a superclass.

How does that look to you?

Show outdated Hide outdated lib/jekyll-import/importers/drupal6.rb
@@ -1,139 +1,53 @@
require 'jekyll-import/importers/drupal_common.rb'

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

You can drop the .rb from this. 😄

@parkr

parkr Apr 27, 2016

Member

You can drop the .rb from this. 😄

Show outdated Hide outdated lib/jekyll-import/importers/drupal6.rb
c.option 'types', '--types TYPE1[,TYPE2[,TYPE3...]]', Array, 'The Drupal content types to be imported.'
def self.get_query(prefix, types)
types = types.join("' OR n.type = '")
types = "n.type = '#{types}'"

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Can you add some documentation which defines what types is? And use different variable names for these intermediate values instead of reusing types?

@parkr

parkr Apr 27, 2016

Member

Can you add some documentation which defines what types is? And use different variable names for these intermediate values instead of reusing types?

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Alternatively, construct this piece of the query in its own method that can be called on line 26 below directly.

@parkr

parkr Apr 27, 2016

Member

Alternatively, construct this piece of the query in its own method that can be called on line 26 below directly.

Show outdated Hide outdated lib/jekyll-import/importers/drupal6.rb
c.option 'host', '--host HOST', 'Database host name (default: "localhost")'
c.option 'prefix', '--prefix PREFIX', 'Table prefix name'
c.option 'types', '--types TYPE1[,TYPE2[,TYPE3...]]', Array, 'The Drupal content types to be imported.'
def self.get_query(prefix, types)

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

We try not to use get_ methods in Ruby; we prefer to see something like query for get_query and query= for set_query. In this case, your building a query. I'd be happy with something like build_query. 😄

@parkr

parkr Apr 27, 2016

Member

We try not to use get_ methods in Ruby; we prefer to see something like query for get_query and query= for set_query. In this case, your building a query. I'd be happy with something like build_query. 😄

Show outdated Hide outdated lib/jekyll-import/importers/drupal6.rb
mysql
])
def self.get_aliases_query(prefix)
return "SELECT src AS source, dst AS alias FROM #{prefix}url_alias WHERE src = ?"

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Please avoid explicit returns. Just the string will do.

@parkr

parkr Apr 27, 2016

Member

Please avoid explicit returns. Just the string will do.

Show outdated Hide outdated lib/jekyll-import/importers/drupal6.rb
safe_yaml
mysql
])
def self.get_aliases_query(prefix)

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Same as above about the get_ prefix. Please remove it and just call this aliases_query.

@parkr

parkr Apr 27, 2016

Member

Same as above about the get_ prefix. Please remove it and just call this aliases_query.

Show outdated Hide outdated lib/jekyll-import/importers/drupal6.rb
}.delete_if { |k,v| v.nil? || v == ''}.each_pair {
|k,v| ((v.is_a? String) ? v.force_encoding("UTF-8") : v)
}.to_yaml
def self.get_data(post)

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Same here with the get_ prefix. Please call this data, or something which describes more of its purpose, like build_post. I'd recommend also renaming the post variable to post_result or something that indicates that it is from SQL.

@parkr

parkr Apr 27, 2016

Member

Same here with the get_ prefix. Please call this data, or something which describes more of its purpose, like build_post. I'd recommend also renaming the post variable to post_result or something that indicates that it is from SQL.

Show outdated Hide outdated lib/jekyll-import/importers/drupal7.rb
@@ -1,111 +1,54 @@
require 'jekyll-import/importers/drupal_common.rb'

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Same here with dropping the .rb prefix.

@parkr

parkr Apr 27, 2016

Member

Same here with dropping the .rb prefix.

end
data = {
'excerpt' => summary,
'categories' => tags.split('|')

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Why are the tags being assigned to the Jekyll post's categories flag? Shouldn't they remain tags?

@parkr

parkr Apr 27, 2016

Member

Why are the tags being assigned to the Jekyll post's categories flag? Shouldn't they remain tags?

This comment has been minimized.

@parkr

parkr Jul 15, 2016

Member

@borfast Would you mind answering this question for me? May just be a lexicographical difference between Jekyll and Drupal. 😄

@parkr

parkr Jul 15, 2016

Member

@borfast Would you mind answering this question for me? May just be a lexicographical difference between Jekyll and Drupal. 😄

This comment has been minimized.

@borfast

borfast Jul 18, 2016

Contributor

@parkr, Drupal doesn't really distinguish between tags and categories, they're all "terms" in a "vocabulary". You can create multiple vocabularies and call them whatever you want, so if you feel like it, you can create a "Categories" vocabulary and a "Tags" vocabulary, each with its own set of terms that you can assign to each post.

The problem is, there's really no way of telling which vocabulary (if any) the author wants to use as the source for categories or tags because there's no hard-coded distinction between those two concepts. They're just "words" you assign to each post and they're grouped together under one or more collection of words.

The consequence is that it makes more sense to just assign everything to either categories or posts, instead of trying to guess which is which.

I suppose an option could be added to the importer to allow the user to specify which vocabularies should be used for tags and categories.

@borfast

borfast Jul 18, 2016

Contributor

@parkr, Drupal doesn't really distinguish between tags and categories, they're all "terms" in a "vocabulary". You can create multiple vocabularies and call them whatever you want, so if you feel like it, you can create a "Categories" vocabulary and a "Tags" vocabulary, each with its own set of terms that you can assign to each post.

The problem is, there's really no way of telling which vocabulary (if any) the author wants to use as the source for categories or tags because there's no hard-coded distinction between those two concepts. They're just "words" you assign to each post and they're grouped together under one or more collection of words.

The consequence is that it makes more sense to just assign everything to either categories or posts, instead of trying to guess which is which.

I suppose an option could be added to the importer to allow the user to specify which vocabularies should be used for tags and categories.

Show outdated Hide outdated lib/jekyll-import/importers/drupal7.rb
safe_yaml
])
def self.get_aliases_query(prefix)
return "SELECT source, alias FROM #{prefix}url_alias WHERE source = ?"

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

My comments from the drupal6.rb file also apply to this file 😄

@parkr

parkr Apr 27, 2016

Member

My comments from the drupal6.rb file also apply to this file 😄

Show outdated Hide outdated lib/jekyll-import/importers/drupal_common.rb
sequel
fileutils
safe_yaml
])

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Would you please fix the indentation here? Two spaces for each indentation, and rubygems and the rest of the list should be indented from JekyllImport.

@parkr

parkr Apr 27, 2016

Member

Would you please fix the indentation here? Two spaces for each indentation, and rubygems and the rest of the list should be indented from JekyllImport.

c.option 'password', '--password PW', "Database user's password (default: '')"
c.option 'host', '--host HOST', 'Database host name (default: "localhost")'
c.option 'prefix', '--prefix PREFIX', 'Table prefix name'
c.option 'types', '--types TYPE1[,TYPE2[,TYPE3...]]', Array, 'The Drupal content types to be imported.'

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Could you add the list of defaults here?

@parkr

parkr Apr 27, 2016

Member

Could you add the list of defaults here?

c.option 'dbname', '--dbname DB', 'Database name'
c.option 'user', '--user USER', 'Database user name'
c.option 'password', '--password PW', "Database user's password (default: '')"
c.option 'host', '--host HOST', 'Database host name (default: "localhost")'

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

It might make sense to specify these defaults as constants in the class and refer to them here and in process.

@parkr

parkr Apr 27, 2016

Member

It might make sense to specify these defaults as constants in the class and refer to them here and in process.

Show outdated Hide outdated lib/jekyll-import/importers/drupal_common.rb
db = Sequel.mysql(dbname, :user => user, :password => pass, :host => host, :encoding => 'utf8')
query = self.get_query(prefix, types)

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

This self. shouldn't be necessary.

@parkr

parkr Apr 27, 2016

Member

This self. shouldn't be necessary.

dirs = {
:_posts => File.join(src_dir, '_posts').to_s,
:_drafts => File.join(src_dir, '_drafts').to_s,
:_layouts => File.join(src_dir, '_layouts').to_s

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

If you're checking the configuration, it would be fruitful to check the value of layouts_dir here. It should default to <source>/_layouts but can be modified by the user.

@parkr

parkr Apr 27, 2016

Member

If you're checking the configuration, it would be fruitful to check the value of layouts_dir here. It should default to <source>/_layouts but can be modified by the user.

Show outdated Hide outdated lib/jekyll-import/importers/drupal_common.rb
# Create the refresh layout
# Change the refresh url if you customized your permalink config
File.open(File.join(dirs[:_layouts], 'refresh.html'), 'w') do |f|
f.puts <<EOF

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Please change this to <<-HTML.

@parkr

parkr Apr 27, 2016

Member

Please change this to <<-HTML.

Show outdated Hide outdated lib/jekyll-import/importers/drupal_common.rb
<meta http-equiv="refresh" content="0;url={{ page.refresh_to_post_id }}.html" />
</head>
</html>
EOF

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Please change this from EOF to HTML. :)

@parkr

parkr Apr 27, 2016

Member

Please change this from EOF to HTML. :)

Show outdated Hide outdated lib/jekyll-import/importers/drupal_common.rb
db[query].each do |post|
# Get required fields
data, content = self.get_data(post)

This comment has been minimized.

@parkr

parkr Apr 27, 2016

Member

Can the contents of this each block please be pulled out into its own method and split up into sub methods if necessary? It's huge!

@parkr

parkr Apr 27, 2016

Member

Can the contents of this each block please be pulled out into its own method and split up into sub methods if necessary? It's huge!

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Apr 27, 2016

Member

@borfast Excellent, thank you! This is a terrific start. Just a few comments above. 😄

Are you able to write tests for these as well?

Member

parkr commented Apr 27, 2016

@borfast Excellent, thank you! This is a terrific start. Just a few comments above. 😄

Are you able to write tests for these as well?

@borfast

This comment has been minimized.

Show comment
Hide comment
@borfast

borfast Apr 30, 2016

Contributor

Sure thing, @parkr, I'll take care of these things as soon as possible. You can tell I don't use Ruby much :)

Contributor

borfast commented Apr 30, 2016

Sure thing, @parkr, I'll take care of these things as soon as possible. You can tell I don't use Ruby much :)

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr May 12, 2016

Member

@borfast Your Ruby is great! Much better than what I was writing before I was using it daily! 😄

Member

parkr commented May 12, 2016

@borfast Your Ruby is great! Much better than what I was writing before I was using it daily! 😄

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Jul 14, 2016

Member

@borfast Want to patch the above or do you want me to do it?

Member

parkr commented Jul 14, 2016

@borfast Want to patch the above or do you want me to do it?

@borfast

This comment has been minimized.

Show comment
Hide comment
@borfast

borfast Jul 14, 2016

Contributor

@parkr, sorry for taking so long, life hasn't been kind to me lately and I fell behind on several things.

I had already made a few of the easiest changes, which I just pushed in one commit. I'm not sure I'll have time for the rest in the next week, so if you want to finish up the rest of the stuff, go ahead.

Contributor

borfast commented Jul 14, 2016

@parkr, sorry for taking so long, life hasn't been kind to me lately and I fell behind on several things.

I had already made a few of the easiest changes, which I just pushed in one commit. I'm not sure I'll have time for the rest in the next week, so if you want to finish up the rest of the stuff, go ahead.

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Jul 19, 2016

Member

LGTM!

Member

parkr commented Jul 19, 2016

LGTM!

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Jul 19, 2016

Member

@borfast I have some incoming changes if you don't mind taking a look. They'll be in a follow-up PR.

@jekyllbot: merge

Member

parkr commented Jul 19, 2016

@borfast I have some incoming changes if you don't mind taking a look. They'll be in a follow-up PR.

@jekyllbot: merge

@jekyllbot jekyllbot merged commit d8af05e into jekyll:master Jul 19, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
jekyll/lgtm Approved by @parkr.

jekyllbot added a commit that referenced this pull request Jul 19, 2016

parkr added a commit that referenced this pull request Jul 19, 2016

parkr added a commit that referenced this pull request Jul 19, 2016

@borfast borfast deleted the borfast:generic-drupal branch Jul 19, 2016

@borfast

This comment has been minimized.

Show comment
Hide comment
@borfast

borfast Jul 19, 2016

Contributor

Just replied on the other PR. Thanks for merging this one, @parkr :)

I have a few more ideas for improvements on the Drupal importer. I'll give them a shot when I have a bit more time.

Contributor

borfast commented Jul 19, 2016

Just replied on the other PR. Thanks for merging this one, @parkr :)

I have a few more ideas for improvements on the Drupal importer. I'll give them a shot when I have a bit more time.

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Jul 19, 2016

Member

I have a few more ideas for improvements on the Drupal importer. I'll give them a shot when I have a bit more time.

I will gladly review! Now that this monster is merged, merges should be quicker too 😄

Member

parkr commented Jul 19, 2016

I have a few more ideas for improvements on the Drupal importer. I'll give them a shot when I have a bit more time.

I will gladly review! Now that this monster is merged, merges should be quicker too 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment