Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse hosted files migration tools #2

Merged
merged 12 commits into from
Jul 19, 2016

Conversation

JeremyPlease
Copy link
Member

@JeremyPlease JeremyPlease commented Jul 18, 2016

This pull request includes quite a bit of changes to resolve parse-community/parse-server#8. The README probably covers functionality the best. But to sum up I've added the following functionality:

  1. Prompt for information or read a configuration file.
  2. Only process Parse hosted files, Parse Server hosted files, or all files.
  3. Rename files so they don't start with "tfss-" or match Parse legacy file pattern.
  4. Transfer files using a Parse Server file adapter (and log progress/errors).
  5. Rename files in MongoDB.

There are no tests written, but I did test across a couple Parse apps using each of the storage providers without any issues. Let me know if you have any questions or suggestions.

… rename

- Comprehensive prompt system to collect all information
- Or specify all options in a config file
- Options to transfer only Parse hosted files, Parse Server files, or all files
- Processes 5 files at a time
- Logs progress and errors to console
@@ -19,6 +72,10 @@ function onlyFiles(schemas) {
function getAllObjects(baseQuery) {
var allObjects = [];
var next = function(startIndex) {
if (startIndex > 10000) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate on that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should use a variable for this: (var maxParseSkip = 10000)

The purpose of this is to work around Parse's limitation of skipping more than 10k. If you call query.skip(10001) you'll get a response of {"code":154,"error":"Skips larger than 10000 are not allowed"}.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhh very good call.
Instead should we order by createdAt and use the last object all the times? 1 logic instead of 2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Done!

@flovilmart
Copy link
Contributor

Hey @JeremyPlease that is awesome 👏 🎉 !!!! Really many thanks!

Just one thing, I was thinking of leveraging the parse-server-files adapters as they are already implemented and all share the same API.

That would greatly reduce the code footprint here, and let any user pass it's adapter configured from the config.js

What do you think?

@JeremyPlease
Copy link
Member Author

Thanks @flovilmart. Happy to help!

Making use of the parse-server-files adapters makes a lot of sense.

I'd like to keep the ability to run the entire file migration through the command line without any configuration file. So, I'll likely check for a filesAdapter option in the configuration file and make use of the passed in adapter's createFile function. If filesAdapter option is not present, then I'll load the appropriate adapter based on command prompt responses.

I'll make the changes and update the PR in the morning.

@flovilmart
Copy link
Contributor

@JeremyPlease Thank you for the help!

As you mentioned the node index.jsconfig.js I was thinking you wanted to use that.
let's do the basic ones from the CLI and then let the option to pass another one from config.js.

What do you think?

No need to handle skip > 10k separately now.
var next = function(startIndex) {
if (startIndex > 10000) {
var next = function() {
if (allObjects.length) {
baseQuery.greaterThan('createdAt', allObjects[allObjects.length-1].createdAt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should order on createdAt and force the greaterThan no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do 😃 The query is created in getObjectsWithFilesFromSchema with query.ascending('createdAt') on line 93.

Copy link
Contributor

@flovilmart flovilmart Jul 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@flovilmart
Copy link
Contributor

now if we can just refactor transfer.js with the file adapters, that would just rock and I'll merge/ release!

- File adapter can be passed in through configuration file.
- If no filesAdapter specified will load fs, s3, or gcs adapter based on prompts.
- Also updated callback based functions to return a promise.
@JeremyPlease
Copy link
Member Author

@flovilmart Should be good to go now!

If you can, please test against some real data. You can set renameInDatabase to false so that the database isn't actually updated.

@flovilmart
Copy link
Contributor

Can we set to dry run by default, and only write to the DB when we're done?

@flovilmart
Copy link
Contributor

flovilmart commented Jul 19, 2016

Awesome, I'm tempted to merge that, and then build a fake app with bunch of images?

Or even better, add unit tests that would create a bunch of Files on parse.com. and get them no?

@acinader
Copy link
Member

i'll be able to test this in our dev env. i'll try to get it going this afternoon....

@JeremyPlease
Copy link
Member Author

@flovilmart Go for it! I imagine there will be some additional kinks to work out, but everything is working well on the apps that I've tested against.

@davimacedo
Copy link

I am following the thread the whole day :)
Also available and excited to test.

@flovilmart
Copy link
Contributor

The code look neat, let's merge that, it'S not officially supported anyway, just a cool hack from you @JeremyPlease Thanks!!!

@flovilmart flovilmart merged commit 466bb29 into parse-server-modules:master Jul 19, 2016
@@ -31,3 +31,5 @@ node_modules

# Optional REPL history
.node_repl_history

.DS_Store
Copy link
Member

@acinader acinader Jul 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my $.02 is that this should go in ~/.gitignore_global (ymmv)

and there should be a cr at the end of this file.

@JeremyPlease
Copy link
Member Author

Nice! Thanks @flovilmart

Anyone trying this out, please be sure to read the README and report any issues.

@flovilmart
Copy link
Contributor

We should put a disclaimer in the README first

@acinader
Copy link
Member

@JeremyPlease the readme says:

"As soon as this script transfers files away from Parse.com hosted files (and renames them in
 the database) any clients that use api.parse.com will no longer be able to access the files. See
 the section titled "5. Files" in the Parse Migration Guide and Parse Server issue #1582."

So if understand this right, your pr enables us to copy all of the files out of the parse.com filespace and into our own, but doesn't address my take on #8 which is to allow parse.com and parse-server clients to both work with pre/post parse-server migration files? I'm reading again....

@JeremyPlease
Copy link
Member Author

@flovilmart Got any standard disclaimer test you want to drop into the README or maybe something like MongoDB support-tools disclaimer?


@acinader You understood correctly. You can read the discussion in parse-community/parse-server#1582 (comment).

Essentially, when a file is hosted with the filesAdapter of Parse Server, the servers at api.parse.com have no way of knowing where the file URL is and just sets it to "files.parsetfss.com/...". The only way to have forwards compatibility would be if changes were made to both open source Parse Server and api.parse.com. Unfortunately, a change to api.parse.com is highly unlikely.

I would recommend not migrating your files until you no longer care about supporting clients using api.parse.com.

@acinader
Copy link
Member

first off, @JeremyPlease thank you! been very useful reading your comments as well as this pr. I have a followup question for you, which i'll post to you in parse-community/parse-server#1582 (comment)

@flovilmart
Copy link
Contributor

@JeremyPlease the mongo one is pretty cool, I'm gonna copy paste it

@barnaclejive
Copy link

@JeremyPlease Thanks! Maybe this should be obvious, but can you elaborate a little on the serverURL option?

Under what circumstances would this be api.parse.com/1 vs an instance of self-hosted parse-server. The documentation is little confusing because it says it defaults to http://api.parse.com/1 but in the config.example.js the example url looks more like a self-hosted parse-server url. If it is a self-hosted serse-server, then what function does the mongoURL play (wouldn't connecting to the parse-server dictate the mongo db?). Are there certain combinations of the options that don't make sense? It isn't clear without a little more detail about how each option impacts the script.

In general, I think some quick explanations would be helpful about how the migration works given various combinations of the serverURL, filesToTransfer, and mongoURL options. The two most common scenarios would probably be "pre-migration" and "post-migration" strategies. Ideally, this would all be covered in detail within Step 5 of the Migration Guide

Thanks again for all the work, this is a key milestone towards pulling the trigger on migration.

@JeremyPlease
Copy link
Member Author

@barnaclejive I've updated the configuration options section or the README.

The configuration options are there to provide flexibility. For migrating files from parse.com to your own file adapter, I would recommend waiting for all clients to be pointed at your new Parse Server and then running with a config.js like this one:

module.exports = {
  applicationId: "PARSE_SERVER_APPLICATION_ID",
  masterKey: "PARSE_SERVER_MASTER_KEY",
  serverURL: "https://api.customparseserver.com/parse", // your Parse Server url
  renameInDatabase: true,
  mongoURL: "mongodb://<username>:<password>@mongourl.com:27017/database_name",
  filesToTransfer: 'parseOnly',
  filesAdapter: yourParseFilesAdapter
};

This will transfer parse.com hosted files with your filesAdapter and rename the files in MongoDB so that Parse Server sets the file url properly.

@flovilmart
Copy link
Contributor

Also note that it won't migrate files that are in arrays or files that are referenced in nested objects.

@jnoreika
Copy link

@flovilmart Is another tool or suggestion on how to migrate files that are in arrays and/or nested objects?

@flovilmart
Copy link
Contributor

We could add the support for arrays and nested objects in that tool, but that would make it way less efficient, probably as an option.

@flovilmart
Copy link
Contributor

Yeah, that's quite small you're right! That would be a welcomed addition :)

@flovilmart
Copy link
Contributor

as well as some unit tests

@jnoreika
Copy link

@acinader I was concerned they wouldn't be covered due to @flovilmart comment on Jul 19.

I do have objects such as he mentions that have array field types and those arrays in turn have nested objects that refer to files like in the json below:

[{
"sectionData": [{
"category": "Plants",
"images": [{
"__type": "File",
"name": "tfss-file-name-UUID-here-AnImage1.jpeg",
"url": "http://files.parsetfss.com/app-file-key-here/tfss-file-name-UUID-here-AnImage1.jpeg"
}]
}]
}]

@acinader
Copy link
Member

sorry guys, I deleted my comment a) cause I realized what you meant by nested object and b) realized that you can't determine by just looking at the schema, so it is more complicated to figure out what heuristic to use (cause you clearly don't want to check all records to see, some reasonable sample would need to be used) to determine what array and/or object fields might have a file :).

@flovilmart
Copy link
Contributor

That is effectively not covered :) but that should not be that tough to implement as it's a fully recursive search :)

@flovilmart
Copy link
Contributor

The nested object capability can be on by default, and to speed up the process, one could pass a --no-nested

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants