Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ZEPPELIN-3758]. Convert old note file note.json to new style #3189

Closed
wants to merge 1 commit into from

Conversation

zjffdu
Copy link
Contributor

@zjffdu zjffdu commented Sep 26, 2018

What is this PR for?

This PR is follow up of ZEPPELIN-261, it is to convert old note file note.json to new style when user upgrading zeppelin before 0.9. 2 properties are introduced.

  • zeppelin.notebook.new_format.convert, by default it is false. When enabled, zeppelin will first find all the old note file note.json and then convert it into new style.
  • zeppelin.notebook.new_format.delete_old, by default it is false, when setting true, zeppelin will delete the old note file after converting it into new style.

What type of PR is it?

[Improvement]

Todos

  • - Task

What is the Jira issue?

How should this be tested?

  • CI pass

Screenshots (if appropriate)

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

@prabhjyotsingh
Copy link
Contributor

Also, since we are doing this; should there be a command line utility as well? which on run from command-line can convert from old to new format.

@zjffdu
Copy link
Contributor Author

zjffdu commented Sep 26, 2018

Good idea, I can do it in a follow up PR.

Copy link
Member

@felixcheung felixcheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems rather heavy? so many classes just to convert note

@zjffdu
Copy link
Contributor Author

zjffdu commented Sep 29, 2018

Many classes are old existing code which I just rename them. e.g. OldAzureNotebookRepo is renamed from the old version of AzureNotebookRepo. The most important change is in NotebookRepoSync

Copy link
Member

@felixcheung felixcheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed NotebookRepoSync

OldNotebookRepo oldNotebookRepo =
PluginManager.get().loadOldNotebookRepo(newNotebookRepo.getClass().getCanonicalName());
oldNotebookRepo.init(conf);
List<OldNoteInfo> oldNotesInfo = oldNotebookRepo.list(AuthenticationInfo.ANONYMOUS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would accessing as anonymous here causes some old note to be skipped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, anonymous can access all notes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, that sounds a bit counter - intuitive..

note.setPath(note.getName());
newNotebookRepo.save(note, AuthenticationInfo.ANONYMOUS);
if (deleteOld) {
oldNotebookRepo.remove(note.getId(), AuthenticationInfo.ANONYMOUS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log when deleting?

@@ -732,6 +732,8 @@ public String getZeppelinSearchTempPath() {
ZEPPELIN_NOTEBOOK_STORAGE("zeppelin.notebook.storage",
"org.apache.zeppelin.notebook.repo.GitNotebookRepo"),
ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC("zeppelin.notebook.one.way.sync", false),
ZEPPELIN_NOTEBOOK_NEW_FORMAT_CONVERT("zeppelin.notebook.new_format.convert", false),
ZEPPELIN_NOTEBOOK_NEW_FORMAT_DELETE_OLD("zeppelin.notebook.new_format.delete_old", false),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If zeppelin.notebook.new_format.convert=true and zeppelin.notebook.new_format.delete_old=false, would Zeppelin try to convert same old notes many times, after each restart?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, user need to set zeppelin.notebook.new_format.convert to false when converting is done. so that next time zeppelin won't do the conversion again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the user forgot to change it back, every restart would convert an old version of an old format (json) and overwrites it on a new version of a newer version note. I would go with the script, but it's not easy when you have 100 notes in HDFS :(


@Override
public void save(final Note note, AuthenticationInfo subject) throws IOException {
this.fs.writeFile(note.toJson(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It better be two-phase write approach.
Write to a temp file first, and then rename atomically.
Otherwise we may end up with partial writes, and lost note's content completely (in case if e.g. a file system was full).
See for example https://issues.apache.org/jira/browse/ZEPPELIN-3467

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asfgit asfgit closed this in fa271b9 Oct 10, 2018
@maziyarpanahi
Copy link
Contributor

Does this work out of the box if the notes are in HDFS? I am facing a similar situation which I am guessing is related to this issue:
https://jira.apache.org/jira/projects/ZEPPELIN/issues/ZEPPELIN-3987

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants