Convert your Blogger posts to Markdown for use with Ghost, Jekyll, and more. Created out of frustration that other implementations either don't work or have 1e25 dependencies. This implementation works and has ZERO external dependencies.
- Back up your Blogger posts to an XML file. Instructions from https://support.google.com/blogger/answer/41387:
  - Sign in to Blogger.
  - At the top left, select the blog you want to back up.
  - In the left menu, click Settings.
  - Under "Manage blog," click Back up content > Download.
- Clone or download this repo and run `py convert_blogger_xml_to_md.py "path/to/blogger/backup.xml"`
- Enjoy!
- No external dependencies
- Automatically download images from posts
- Over 25 HTML tags supported, like `table`, `img`, `code`, `a`, `blockquote`, and more
- In-depth HTML to Markdown conversion, including support for converting nested and mixed unordered and ordered lists
- Supports emojis
- Extract posts' Author, Title, Tags, and Publish Date
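As a rough illustration of the metadata extraction listed above, the sketch below reads author, title, tags, and publish date from a Blogger backup using only the standard library. It is not the script's actual code: the function name `extract_post_metadata` is hypothetical, and the Atom element names and `scheme` URLs are assumptions about the export format that should be checked against your own backup file.

```python
import xml.etree.ElementTree as ET

# Assumed namespace and category schemes of a Blogger Atom export.
ATOM = "{http://www.w3.org/2005/Atom}"
KIND_SCHEME = "http://schemas.google.com/g/2005#kind"
POST_KIND = "http://schemas.google.com/blogger/2008/kind#post"
TAG_SCHEME = "http://www.blogger.com/atom/ns#"


def extract_post_metadata(xml_text):
    """Return one dict per post with title, author, published, and tags."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.iter(ATOM + "entry"):
        categories = entry.findall(ATOM + "category")
        # Keep only entries marked as posts; skip comments, settings, etc.
        is_post = any(
            c.get("scheme") == KIND_SCHEME and c.get("term") == POST_KIND
            for c in categories
        )
        if not is_post:
            continue
        posts.append({
            "title": entry.findtext(ATOM + "title", default=""),
            "author": entry.findtext(f"{ATOM}author/{ATOM}name", default=""),
            "published": entry.findtext(ATOM + "published", default=""),
            # Tags are the categories that use the Blogger tag scheme.
            "tags": [c.get("term") for c in categories
                     if c.get("scheme") == TAG_SCHEME],
        })
    return posts
```

Calling `extract_post_metadata(open("path/to/blogger/backup.xml", encoding="utf-8").read())` would then give a list that can be written out as front matter for Jekyll or Ghost.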
The global dictionary `g_converter_config` can set save paths and converter behavior. Comments next to each setting document its use.

The `HTMLToMarkdownParser` class converts HTML to Markdown tag by tag. To extend or change the converter, you'll probably want to add or patch code in `HTMLToMarkdownParser.handle_starttag()` and `HTMLToMarkdownParser.handle_endtag()`.
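To give a feel for what a tag-by-tag extension looks like, here is a minimal sketch built on the standard library's `html.parser.HTMLParser`, which provides the `handle_starttag()` and `handle_endtag()` hooks. The class `TinyMarkdownParser` and helper `html_to_markdown` are hypothetical stand-ins, not the repo's `HTMLToMarkdownParser`, and the assumption that the real class is structured this way is mine; the `del` branch shows where support for a new tag might be added.

```python
from html.parser import HTMLParser


class TinyMarkdownParser(HTMLParser):
    """Hypothetical stand-in showing the start/end tag pattern."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # Markdown fragments, joined at the end
        self.link_href = None  # href of the <a> currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("b", "strong"):
            self.out.append("**")
        elif tag == "a":
            self.link_href = dict(attrs).get("href") or ""
            self.out.append("[")
        elif tag == "del":
            # Example extension: strikethrough for a newly supported tag.
            self.out.append("~~")

    def handle_endtag(self, tag):
        if tag in ("b", "strong"):
            self.out.append("**")
        elif tag == "a":
            self.out.append(f"]({self.link_href})")
            self.link_href = None
        elif tag == "del":
            self.out.append("~~")

    def handle_data(self, data):
        # Text between tags is copied through unchanged.
        self.out.append(data)


def html_to_markdown(html):
    parser = TinyMarkdownParser()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

Each new tag generally needs a matching pair of branches: one in the start handler to open the Markdown syntax and one in the end handler to close it.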
- Converting `div`, `style`, `iframe`, `script` tags
  Converting a post from HTML, CSS, and JS to Markdown, which is graphically limited by design, inherently means there is information that can't be perfectly converted. As such, HTML-specific tags that don't exist in Markdown are either ignored (like `div`) or copied as-is (like `iframe`, `script`, `style`).
- Image captions created in Blogger's editor
  Blogger uses an HTML `table` to couple captions to images. However, this conversion script doesn't (yet) support nesting images inside tables. Therefore, after conversion, captions aren't perfectly aligned to images.
- Only tested converting my blog. This isn't battle-hardened code. Please create a GitHub issue if you run into any bugs.
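
The first limitation above, where `div` is ignored while `iframe`, `script`, and `style` are copied as-is, could be sketched roughly as follows. This is an illustration under my own assumptions, not the script's code: the names `RAW_TAGS`, `IGNORED_TAGS`, `PassthroughSketch`, and `convert` are hypothetical, and any tag outside the two sets is simply dropped here for brevity.

```python
from html.parser import HTMLParser

RAW_TAGS = {"iframe", "script", "style"}  # copied into the output as-is
IGNORED_TAGS = {"div"}                    # tag dropped, inner text kept


class PassthroughSketch(HTMLParser):
    """Hypothetical sketch of ignoring some tags and copying others."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in RAW_TAGS:
            # get_starttag_text() returns the opening tag as written.
            self.out.append(self.get_starttag_text())
        elif tag in IGNORED_TAGS:
            pass  # emit nothing for the tag itself

    def handle_endtag(self, tag):
        if tag in RAW_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def convert(html):
    parser = PassthroughSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Copying raw HTML through works because most Markdown renderers accept inline HTML, though whether an embedded `iframe` or `script` is actually rendered depends on the target platform.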