How easy to switch blog hosting services? #86

Open
scripting opened this Issue Jul 3, 2018 · 13 comments

5 participants
@scripting
Owner

scripting commented Jul 3, 2018

I posted two questions on Twitter, summarized them on my blog, and thought it would be a good idea to post them here as well, to provide a place for a more detailed examination.

Questions about wordpress.com and tumblr.com --

  1. Can you download your entire site in an open format (XML or JSON)?

  2. Can you redirect your site to a new service if you decide to move?

I want to be able to recommend hosting services.

@manton

manton commented Jul 3, 2018

I have some experience now with hosting blogs on Micro.blog and helping people move their sites to and from Micro.blog. You can export from WordPress.com (their XML format is based on RSS!) and import it into your own domain name on Micro.blog, for example. We also automatically download any referenced photos (since they aren't in the XML file) and redirect old URLs so links don't break.

Tumblr has their own export format which I've been meaning to add support for. I also proposed on my blog that we need a new more universal export format.

@facej

facej commented Jul 3, 2018

WordPress can be moved to a different WordPress host in a straightforward manner - IF - the original WordPress site is accessible. The built-in export function would appear to be complete if you choose to export "all content". Not the case. You need to export "Media" in a separate pass. That gets you an XML file that can be used to import all of the media into a different WordPress site.

So, export "media", then export "all content". On the new WordPress site I would import "media" first, followed by "content".

@facej

facej commented Jul 3, 2018

It's kind of amusing to see an XML file with CDATA that contains JSON

@scripting

Owner

scripting commented Jul 3, 2018

@facej -- I think it's cool. Shows a standard that lives past what some would think was its expiration date.

@manton -- good point about downloading the images. I've seen WP criticized for not including the images. How hard is it to find them? Do you have to parse the HTML text?

@facej

facej commented Jul 3, 2018

I think it's fantastic ;-) The "media" XML file contains all the links to the images as well as the WP metadata, which is the JSON part.

@scripting

Owner

scripting commented Jul 3, 2018

@facej -- then the criticism is unwarranted. If the links are easy to access in the XML, then what else could anyone want? Do you have an example of a small exported site in a zip file that could serve as a demo?

@facej

facej commented Jul 3, 2018

My example isn't actually small. I just did the "all content" export and discovered that the "media" info is actually in the all-content version too. All of the media have links like this:

<wp:attachment_url><![CDATA[http://www.cgne-tucson.org/wp-content/uploads/2017/01/Town-Crier-2017-01.pdf]]></wp:attachment_url>

Accessing things in XML can be quite the challenge, but yeah, an XSLT transform would work, as would a fairly simple sed | awk | curl sequence
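As an alternative to XSLT or a shell pipeline, here is a minimal Python sketch of pulling those links out of an export file. It is only an illustration: the sample document is hand-made around the one element quoted above, the `wp` namespace URI is an assumption (it varies by export version, so the code matches on the element's local name instead), and `attachment_urls` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-made sample built around the element quoted above.
# The namespace URI is an assumption; real exports may declare a different version.
SAMPLE = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Town Crier 2017-01</title>
      <wp:attachment_url><![CDATA[http://www.cgne-tucson.org/wp-content/uploads/2017/01/Town-Crier-2017-01.pdf]]></wp:attachment_url>
    </item>
  </channel>
</rss>"""


def attachment_urls(xml_text):
    """Return the text of every attachment_url element, in document order."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Tags look like "{namespace}attachment_url"; compare the local name only
        # so the code does not depend on a particular namespace version.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "attachment_url" and element.text:
            urls.append(element.text.strip())
    return urls


if __name__ == "__main__":
    for url in attachment_urls(SAMPLE):
        print(url)
```

The resulting list could then be handed to whatever download tool you prefer. The CDATA wrapper needs no special handling because the XML parser returns its contents as ordinary text.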

@facej

facej commented Jul 3, 2018

WP-export.zip

Media-only export. All-content export.

@manton

manton commented Jul 3, 2018

@scripting As @facej says the image information is in the XML file, but I actually parse the HTML for each post because there might be posts with images that didn't use WordPress's upload feature, and I might need to update the HTML anyway. Also if Micro.blog gets a 404 when downloading the image (because the site is no longer online), it checks the Internet Archive to see if there's a copy there.

It all "works" but having an archive format that contained the actual images would be more robust, in my opinion.
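For readers who want to try the HTML-parsing approach, here is a rough Python sketch of the two steps described above: collecting `<img>` sources from a post body, and building a lookup URL for the Internet Archive when the original image is gone. This is not Micro.blog's actual code. The names `ImageCollector`, `image_sources`, and `wayback_lookup_url` are hypothetical, and the sketch assumes the Internet Archive's public Wayback availability endpoint at `archive.org/wayback/available`. It only builds the request URL and does not perform any network calls.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_sources(post_html):
    """Return image URLs referenced in a post's HTML, in document order."""
    collector = ImageCollector()
    collector.feed(post_html)
    collector.close()
    return collector.sources


def wayback_lookup_url(image_url):
    """Build an availability query for an image URL (assumed endpoint, see above)."""
    return "https://archive.org/wayback/available?" + urlencode({"url": image_url})


if __name__ == "__main__":
    html = '<p>Hi <img src="http://example.com/a.jpg" alt=""> and <img src="/b.png"/></p>'
    for src in image_sources(html):
        print(src, "->", wayback_lookup_url(src))
```

Relative sources such as `/b.png` would need to be resolved against the blog's base URL before downloading or looking them up.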

@bradbarrish

bradbarrish commented Jul 4, 2018

I will say that having a very large WP database can significantly complicate things in terms of switching hosts. Very recently I decided to move my blog, which has existed since 2001, from a self-hosted WordPress install with Dreamhost to WordPress.com. The main reason is that I’m trying to reduce the number of things I have to maintain myself. I started that process over a month ago and am still working with the support team to get everything working right. We keep having to switch the DNS back and forth, especially due to images breaking on WordPress.com. All that said, I certainly feel good about all the content being under my control, but it’s not easy in my experience.

@ttepasse

ttepasse commented Jul 4, 2018

Btw: the European Union's General Data Protection Regulation became applicable in May. One of the legal rights therein is in Article 20, the right to data portability "in a structured, commonly used and machine-readable format".

That regulation applies if the organization or the user is based in the EU. So while WordPress and Tumblr/Yahoo are American companies, if they process data of EU citizens, they should protect the rights in the GDPR, including data portability. If I were in the market for a hosting service, GDPR compliance is something I'd look for.

@facej

facej commented Jul 4, 2018

@bradbarrish Interesting. I used to maintain a lot of locally-hosted WP sites. Moved to Dreamhost a few years ago to make things "simpler". I find that the self-updating WP sites at DH, along with linking all my sites up with Wordpress.com makes everything mostly background.

Yes, the DNS thing would be an issue when moving from one place to another. That was my very first comment - it is straightforward if the "source" blog is active while moving to a "destination" blog.

I struggled with that when I moved things to Dreamhost, but figured out the "better path" for me.

Repository owner deleted a comment from donhodges Jul 4, 2018

@bradbarrish

bradbarrish commented Jul 4, 2018

@facej yeah, I’m getting the feeling I’ve made a mistake. Gonna give it one more round with the support people at WordPress.com, and if it doesn’t work, I’m just going to revert to DH and keep it all going from there. My main issue is the slowness of my site on the shared hosting plan I have, so I may just have to pony up. DH is a great company. I have nothing but love for them. Support is second to none.
