Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ZEPPELIN-1257] Decouple revision id handling #1277

Conversation

khalidhuseynov
Copy link
Contributor

What is this PR for?

This PR is to decouple Revision from RevisionId and let users to provide their own RevisionId class as a generic input to the provided RevisionId class.

What type of PR is it?

Improvement | Refactoring

Todos

  • - Separate Revision class
  • - Decouple RevisionId class as generic
  • - Update api for getNoteRevision to accept RevisionId instead of Revision

What is the Jira issue?

ZEPPELIN-1257

How should this be tested?

CI green

Screenshots (if appropriate)

Questions:

  • Does the licenses files need update? no
  • Is there breaking changes for older versions? no
  • Does this needs documentation? no

@khalidhuseynov
Copy link
Contributor Author

@bzz this is different approach to the problems you mentioned in #1254. Here i decouple Revision from RevisionId class and make the latter one generic so that developers can provide their own implementation of it. Please let me know whether it makes more sense.

@bzz
Copy link
Member

bzz commented Aug 4, 2016

First of all - I appreciate you patience and attention to details and your code looks great. There are few design aspects of it I wanted to discuss with you here

From the diff tab:

+116 −36 Options

This approach, namely, extracting RevisionId looks interesting, but in this particular case it brings 80 new LOC, adds extra class, uses generics, parameterized types, instance of and a wildcard (which is impressive list of language features)

If I try to understand, what are the benefits of Something<?> compared to Something, given that Something is not some kind of collection, but I'm having hard time doing it. I think there are none, and Java Language Specification, section 4.8 Raw Types support my guess.

Raw types are closely related to wildcards. Both are based on existential types. Raw types can be thought of as wildcards whose type rules are deliberately unsound, to accommodate interaction with legacy code. Historically, raw types preceded wildcards;


I think we are on the way of coming up with a better design here together, so let's try to follow Occam's razor design principle and think in term of API.

First of all, here is current problem, how I see: we are talking about design of the "Notebook Repository subsystem or API" that abstracts (hides all implementation details of) the process of getting an information about notebooks and it's lineage from the persistent storage.

This subsystem provides a public and uniform API for all it's clients to get the a notebooks, together with their history (modifications\snapshots).

NotebookRepo is part of it, it defines the semantics of the storage (list, get, etc),

Revision is another part of it that represents a point in time, selected and annotated by the user. We want NotebookRepo semantics to be aware of of such entities.

In terms of API design, first question to ask is always the same - "who are the clients"?

In our case there are:

  • NotebookServer.java
  • Notebook.java

If we agree on this general terms, I'll be happy to go on, and discuss possible ways to solve this problem. Please, et me know what you think!

@khalidhuseynov
Copy link
Contributor Author

@bzz giving some context

  1. this is not about whole NotebookRepo api, it's related to only one method here
  2. the clients identified correctly (NotebookServer and Notebook)

So the question was that given current api as follows :
get(String noteId, Revision rev, AuthenticationInfo subject)
how can we improve it? as you can see, Revision contains id as well as other constructs representing additional information about Revision, however in get we need only id.

further given your feedback from #1254 here :

Also, could you please explain why do you think this API change is valuable and what benefits does it bring for a user? (i.e does it consider other possible versioned storage i.e IPFS, BitTorrent and other future ones, where just an ID might not be enough? or it might be not String)

Now, I tried to address that problem here by adding RevisionId<?> which is used by clients (NotebookServer, Notebook) without knowledge of inner representation (placeholder ?), and only on repo level each NotebookRepo deals with his own type of RevisionId (say RevisionId<String>). The only drawback is usage of instanceof here and possible diversion of RevisionIds for different repos.

@bzz
Copy link
Member

bzz commented Aug 12, 2016

Shall we close this guy then in favor of #1254 ?

@khalidhuseynov khalidhuseynov deleted the improve/revision-id-handling branch August 12, 2016 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants