This project takes a JSON archive scraped from /r/GenZhou (before it was banned) and translates the posts into a PL/pgSQL script that inserts them into a Lemmy database for rehosting elsewhere.
You can download the generated SQL script from the repo's releases page and run it directly. It's hardcoded to add posts to c/genzhouarchive under the user u/archive_bot, both of which must already exist before running the script.
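Since the script expects the community and the user to exist already, it can help to check for them first. The following is only a sketch: `precheck_sql` is a hypothetical helper, and the table names `community` and `person` (each with a `name` column) are assumptions about the Lemmy schema that you should verify against your own database.

```shell
# Print a query reporting whether the target community and user exist.
# ASSUMPTION: Lemmy stores communities in a "community" table and users in a
# "person" table, each with a "name" column. Check your schema before relying
# on this.
precheck_sql() {
    community="$1"
    user="$2"
    # The names are interpolated directly, so only pass values you trust.
    cat <<EOF
SELECT
  EXISTS (SELECT 1 FROM community WHERE name = '$community') AS community_exists,
  EXISTS (SELECT 1 FROM person WHERE name = '$user') AS user_exists;
EOF
}

# Usage (on the database server; psql reads the query from stdin):
#   precheck_sql genzhouarchive archive_bot | psql --dbname=lemmy --username=lemmy
```

If either column comes back false, create the missing community or user in Lemmy before running the import.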
If you want to change the default community name or user name, you'll have to run the code and generate the SQL script yourself. This is a good idea anyway, since it's not wise to trust me to run arbitrary queries on your database, and in this case the easiest way to review the queries is to review the code that generates them.
Prerequisites for running the prebuilt jar: Java 8 or above
Download the jar file from the releases page and run it:
java -jar genZhouImporter-0.2-SNAPSHOT.jar genzhouarchive archive_bot
The output file GenZhouArchive.sql will be placed in your current working directory.
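If you're not sure whether the installed Java meets the "Java 8 or above" requirement, a quick check along these lines may help. This is a sketch, not part of the project: `java_major` and `check_java` are hypothetical helper names, and the parsing assumes the usual `version "x.y.z"` line printed by `java -version`.

```shell
# Extract the major Java version from a version string such as
# "1.8.0_292" (Java 8) or "17.0.1" (Java 17).
java_major() {
    case "$1" in
        1.*) v="${1#1.}"; printf '%s\n' "${v%%.*}" ;;      # legacy 1.x scheme
        *)   printf '%s\n' "${1%%[!0-9]*}" ;;              # modern scheme
    esac
}

# Read the version from the java on PATH and compare it against 8.
# Not called automatically; run check_java yourself.
check_java() {
    # java -version writes to stderr, hence the 2>&1.
    ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    if [ "$(java_major "$ver")" -ge 8 ]; then
        echo "Java $ver is new enough"
    else
        echo "Java 8 or above is required, found: $ver" >&2
        return 1
    fi
}
```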
Prerequisites for building from source: JDK >=1.8, Maven 3.
Clone the repo and cd to the GenZhouImporter directory. Run:
mvn compile
mvn exec:java -Dexec.args="genzhouarchive archive_bot"
(This will pull down dependencies from Maven Central so you must be connected to the internet during the compile step.)
The output file GenZhouArchive.sql will be placed in your current working directory.
(Note that the command below uses the default values for the database name and database username. If you've changed them in your Lemmy configuration then update the values accordingly.)
Copy GenZhouArchive.sql to the server running Postgres and run this:
psql --dbname=lemmy --username=lemmy --file=GenZhouArchive.sql
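If your database name or username differ from the defaults, a small wrapper like the one below keeps them in one place. It is a sketch under assumptions: `LEMMY_DB`, `LEMMY_DB_USER`, `PSQL` and `run_import` are names made up for this example. It also sets psql's `ON_ERROR_STOP` variable, because by default psql keeps going after an error, which is probably not what you want for a bulk import.

```shell
# Hypothetical settings for this sketch; the defaults match the command above.
LEMMY_DB="${LEMMY_DB:-lemmy}"
LEMMY_DB_USER="${LEMMY_DB_USER:-lemmy}"

# Run the generated script, stopping at the first SQL error.
# PSQL can be overridden (e.g. PSQL=echo) for a dry run that only prints
# the arguments instead of touching the database.
run_import() {
    sql_file="$1"
    if [ ! -s "$sql_file" ]; then
        echo "missing or empty file: $sql_file" >&2
        return 1
    fi
    ${PSQL:-psql} --set=ON_ERROR_STOP=1 \
        --dbname="$LEMMY_DB" --username="$LEMMY_DB_USER" \
        --file="$sql_file"
}

# Usage:
#   PSQL=echo run_import GenZhouArchive.sql   # dry run
#   run_import GenZhouArchive.sql             # real import
```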
(Note that the commands below use the default values for the database name and database username. If you've changed them in your Lemmy configuration then update the values accordingly.)
Copy GenZhouArchive.sql to the server running Docker and run this:
docker cp ./GenZhouArchive.sql $(docker ps -qf name=postgres):/var/lib/postgresql
docker exec -it $(docker ps -qf name=postgres) psql --dbname=lemmy --username=lemmy --file=/var/lib/postgresql/GenZhouArchive.sql
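The commands above assume that `docker ps -qf name=postgres` matches exactly one container. If it matches none, or several, the commands will fail in confusing ways. A possible guard, with hypothetical helper names `single_id` and `import_via_docker`, might look like this:

```shell
# Print the container ID only if exactly one ID was passed in.
single_id() {
    if [ "$#" -ne 1 ]; then
        echo "expected exactly one matching container, found $#" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Same two steps as above, but the container is looked up once and checked.
# Not called automatically; run import_via_docker yourself.
import_via_docker() {
    # Unquoted on purpose: each ID printed by docker becomes one argument.
    container=$(single_id $(docker ps -qf name=postgres)) || return 1
    docker cp ./GenZhouArchive.sql "$container":/var/lib/postgresql &&
    docker exec -it "$container" psql --dbname=lemmy --username=lemmy \
        --file=/var/lib/postgresql/GenZhouArchive.sql
}
```

If the guard reports more than one container, narrow the `name=` filter to the full name of your Postgres container.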
The input data was scraped by @DongFangHong@lemmygrad.ml, probably some time on 30 March 2022. (For reference, the subreddit was banned on 4 April.)