Browserify for client-side build #87

Closed
RangerMauve opened this issue Aug 21, 2015 · 21 comments

Comments

@RangerMauve

I think it'd be nice if the client-side code got split up somewhat so that all the logical pieces live in separate modules. This would also reduce the need for self-executing function wrappers everywhere and for the use of global scope. As well, this could potentially allow us to reuse a lot of the code between the browser version and the node.js version.
A potential downside is that the file size might go up a teeny bit, but that shouldn't be too much of an issue if we configure the build correctly.

Would you guys be interested in a pull request for such a change?

@RangerMauve
Author

Potentially this would also allow for using some npm modules to replace a lot of the utilities that are currently managed within the code, which really should be handled externally so that this project can focus more on the database logic.

@amark
Owner

amark commented Aug 21, 2015

This is where I definitely need to add way more documentation to stuff and have more comments in my code. So thank you for bringing this up.

First off, my goal is that gun core (the first closure in gun.js) should be less than 1k LOC, or by weight under 10KB compressed & gzipped (which it currently is; it is 6KB). Everything else should be implemented in user-land (the nodejs philosophy).

Second, I personally despise compile steps so I included the "browser-side" hooks inside gun.js (the second closure, called with tab) which handles the localstorage/websocket/jsonp fallbacks. This definitely does not belong in the same file as gun core, however it is pretty tidy and small and I wanted the convenience of people being able to reference just 1 file from the browser and everything "just work" without any extra steps/configuration/compilation/grunt/gulp/whatever. This type of simplicity is incredibly important to me.

Third, gun core (the first closure in gun.js) is divided into the following:

  • Constructor
  • GUN specific Utilities
  • the Prototype Chain
  • Utilities (reusable in both browser and node, a lot of which came from my theory project)
  • the Scheduler
  • the Serializer
  • export / logger

I do agree that it would be nice if these pieces were divided into their own files so they could be maintained more easily. However, then a build step is required, and that adds more complexity overhead than it is personally worth for me. Especially since I'm already trying to keep the core small, so most things will be split up anyways.

Fourth, there is only one namespace that goes into the global scope which is Gun and this has to be done anyways in the browser (ultimately even browserify has to 'pollute' at least one variable into the global namespace). Is there anywhere else I'm causing harm to the global namespace? Please let me know!

Fifth, while lodash and underscore are absolutely excellent libraries (and should be used in conjunction with gun) I find it unrealistic to have gun depend upon them. To compare, lodash is 3x the size of all of gun core and underscore is about the same size as gun core. What would be really nice is that people could easily "swap" between them, but compatibility for this is super hard to achieve/support. Finally, another option is skipping utilities entirely and just using ES6 functions, but that is a no go because there are cases where enterprises require older device (and browser) support.

I know a lot of people get pretty sensitive about these subjects, so please be open to discussing them with me.

Most important, though, is the subject of this issue. Yes, gun should work with browserify even if it does not internally use browserify. I haven't tested whether gun can be built with browserify (I assume it can, since it uses requires and exports). Please report if it can't, and yes, a pull request fix would be immensely appreciated. But a pull to split everything up would be a no unless there is some magical way to avoid build steps.

Thank you so much for jumping in and starting to participate, even just in the discussions!

@RangerMauve
Author

I can see where you're coming from in terms of wanting it all in one place and avoiding build steps. Having it in one file does make it a bit harder to pick out those pieces you described, but that could probably be improved by having more or just better comments. In terms of setting the global namespace, I know a lot of projects still rely on using that for dependencies, and it's definitely OK. I haven't tested this out with browserify, but it seems like it should work.

In terms of utilities I 100% agree with your stance on lodash and the like, but I was referring more to single-purpose modules like is-string or arr-map, which could be picked out one by one. This would eliminate the project having to have tests in place for these utilities and even having to worry about maintaining them at all (though obviously they're so simple that I doubt there would be bugs). If anything it might make sense to just require your theory library as a dependency rather than inlining it. (But that would require a build step, I suppose.)

As well it might be nice to have the serializer and core logic for synchronization be available as separate libraries in the future so that other projects can make use of them.

I'm really glad to be participating since I can't seem to find anything that's doing the same thing in as simple a way.

@amark
Owner

amark commented Aug 21, 2015

Serializer/core logic, yeah, especially for when gun is ported to other languages (go, python, java...), having these cleanly delineated is important. Most of gun core (especially given our discussion on the utilities) is bogged down with just javascript junk which won't apply in other languages. Clarification: the serializer is also js specific (converting js objects into a gun graph). That means things really just boil down to the sync logic. For that I need to write up a specification, which I had started, but it is now lost in the wiki and outdated.

Hmm, wasn't aware that single-function utilities were that popular. It makes sense though. It still seems like a catch-22: ultimately all higher-level logic depends upon lower-level utilities like type checking. And JS has sucky type checking, so people make wrappers or polyfills. But then if you depend upon the wrapper, your project won't work unless people remember to include the wrapper. Bleh.

@RangerMauve
Author

That's where browserify is useful. If you have your dependencies defined in your package.json, then you just need one command to install them. And building the project for the browser is just another command. Browserify also has a nifty feature where you can have the require calls resolve to different files or modules when in the browser compared to the server. So you could have the main file require your server code, and for the browser you can have that resolve to an empty object. And you can have your websocket stuff resolve to either a thin wrapper over the native websocket in the browser, or something built off of ws in node. From my experience it makes development on both platforms WAAAAY nicer and the build can be easy to deal with when combined with something like watchify which makes it so you don't even have to bother with manually rebuilding while you develop.

Having the spec would be really useful since currently I can't tell how it works from the code alone and some docs would help me reason about it more and maybe contribute.

@amark
Owner

amark commented Aug 31, 2015

These are all good points, true and true. Unfortunately though I do not like build steps, even using things like watchify, so it is highly unlikely that gun core will wind up adopting browserify. :(

However I'm not going to close this issue until there is confirmation that gun does work properly with browserify (which I assume it does).

In terms of the spec, the actual conflict resolution is in function HAM in gun core. A prose-based variation of this spec would not be much different, but would be helpful for people from an accessibility/documentation perspective.

Basically (notes to myself for when I do write the prose variation of the spec):

  • All data ever can be described in 4 simple properties:
    • a Universally Unique Identifier to contain named values and distinguish them from other data with conflicting names.
    • a Name to reference the actual value without knowing what it is in advance.
    • the Value which is an indivisible piece of information, or a UUID that points to complex information which is itself composed of these 4 properties.
    • a State to represent change or continuity between values of a name within a UUID.

Graphs accurately model these 4 properties, so I call them node, field, value, state correspondingly. The soul is a special type of UUID that "automatically" points to the current state for all values in the node. How is this done? Using a state machine operating over an open-closed boundary with the following conflict resolution rules:

  1. If the state of the value in question is above the upper boundary then computing on that value should be deferred until another state.
  2. If the state of the value in question is equal or below the upper boundary then computing on that value is valid unless:
  3. There is another known state on the name of the value in question that is above the state we are computing. Or
  4. There is another known state on the name of the value in question that is equal to the state we are computing on. Then
  5. The value of higher lexical order should be preferred.
  6. If they are of the same lexical order then values are identical other than their source.

This next part may be confusing, but it is summarizing the above: The specified algorithm guarantees the deterministic convergence of every value at the known states over every machine within the operating boundary. It does not, however, guarantee linearizability of states, because not all states may be known during the operating boundary of the machine; thus it is eventually consistent. If linearizability must be achieved, then the data itself needs to explicitly link its sequencing, which can be done on top of this specification.
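
As a rough illustration of the six rules, here is a minimal sketch in JavaScript. It is not the HAM function from gun core: the names `resolve`, `boundary`, `incoming`, and `current` are made up for this example, states are assumed to be plain numbers, and "lexical order" is assumed to mean comparing the JSON-stringified values.

```javascript
// Hypothetical sketch of the conflict resolution rules; not gun's actual HAM code.
// boundary: the receiving machine's upper boundary (for example, its current time).
// incoming / current: { state: number, value: any } for the same name on the same node.
function resolve(boundary, incoming, current) {
  // Rule 1: a state above the upper boundary is deferred until a later state.
  if (incoming.state > boundary) return 'defer';
  // Rule 3: a higher state is already known, so the incoming value is outdated.
  if (current.state > incoming.state) return 'keep';
  // Rule 2: within the boundary and above the known state, so it is valid.
  if (incoming.state > current.state) return 'accept';
  // Rules 4-6: equal states, so fall back to lexical order (assumed: JSON strings).
  const a = JSON.stringify(incoming.value);
  const b = JSON.stringify(current.value);
  if (a === b) return 'same';
  return a > b ? 'accept' : 'keep';
}

console.log(resolve(10, { state: 12, value: 'x' }, { state: 5, value: 'y' })); // defer
console.log(resolve(10, { state: 7, value: 'x' }, { state: 5, value: 'y' }));  // accept
console.log(resolve(10, { state: 5, value: 'x' }, { state: 5, value: 'y' }));  // keep
```

Because the function only looks at the two states, the two values, and the local boundary, two peers that see the same pair of updates in a different order should end up keeping the same value, which is the convergence property described above.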

@RangerMauve
Author

How does the state get set? Is it just an incremented counter?

@amark
Owner

amark commented Aug 31, 2015

Mmmm, no, this is where it gets kinda confusing. They are hybrid vector clock / timestamps. Aka every update just has its local timestamp (which might have drift) on it, but a larger timestamp does not mean it will "win" because it has to go through the algorithm in the previous post. An update only "wins" relative to a receiving peer's boundary function.

But yes, in gun's specific implementation the state is just a timestamp, but the conflict resolution does not depend upon timestamps. You could make a variation of gun's implementation that uses something else while still using the same conflict resolution algorithm. The algorithm is agnostic to how you define state, as long as it is linear. Timestamps, counters, and alphanumerics are all linear. Making a variation is not recommended (and if you were to, it must necessarily go by a different name even though the algorithm is the same) because I've found timestamps to be an optimal balance.

Why? Because any state that is bound to operations on the data is so deterministic that you'll get a high number of collisions. Collision is bad because deterministic algorithms are finite in the various edge cases that they address, and the last thing you want is to be constantly hitting those limits. Timestamps are non-deterministic, which is good because it reduces collision, humans have good intuitions and infrastructure to support them (even for interplanetary systems, see MTC https://en.wikipedia.org/wiki/Timekeeping_on_Mars#Coordinated_Mars_Time_.28MTC.29), and they can be easily synchronized where infrastructure doesn't exist with a minimal amount of math. Yes, they have downsides, like being non-deterministic makes them inaccurate (aka clock drift) or corruptible, but those can be compensated for with the algorithm or other methods.

@RangerMauve
Author

Does the entire value have to be replaced when receiving data with a state that would win in the algorithm? Like, is there a way to update just a portion of the JSON of the value, or should cases like that be split up into multiple nodes with relationships between them in the graph?

@amark
Owner

amark commented Sep 1, 2015

Yes, values are atomic; the entire value is replaced. However, don't let that confuse you, because you can do partial updates (https://github.com/amark/gun/wiki/Partials-and-Circular-References) on a node (a JSON object). So yes, you can update just a portion of the JSON, but within {field: "value"} the value (a string, boolean, number, or soul) is updated as a whole.

Lol, I feel like I might have just confused things more.

@RangerMauve
Author

Ah, so does each nested field in a portion of JSON under a given key get treated as a separate piece of data with its own name and state?

@amark
Owner

amark commented Sep 1, 2015

Now that I am on a computer, let me illustrate by example:

{
  name: "fluffy"
  ,species: "cat"
  ,friend: {
    name: "max"
    ,species: "dog"
  }
}

While this looks like a JSON document, you already know that gun stores it as a graph, where Fluffy's friend field points to Max's node. So we have two nodes (Fluffy's and Max's) in the graph. We can make a key in gun to point to Fluffy directly or point to Max directly, these keys are different than Fluffy's friend field which points to Max.

We can do a partial update on either node. If we modify Max with something like gun.put({name: "Maximus"}), the 'Maximus' value will replace the value that was there before ('max'). So values are atomic, indivisible, replaced as a whole. But the nodes can be updated partially. Does that make sense now? The resulting graph converted back into a JSON document would look like this:

{
  name: "fluffy"
  ,species: "cat"
  ,friend: {
    name: "Maximus"
    ,species: "dog"
  }
}
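
To make the "nodes update partially, values replace wholly" distinction concrete, here is a small plain-object sketch. It does not use gun at all; `partialUpdate` is a hypothetical helper written only for this illustration, and the two nodes are ordinary JavaScript objects.

```javascript
// Plain-object sketch; partialUpdate is a hypothetical helper, not a gun API.
const max = { name: 'max', species: 'dog' };
const fluffy = { name: 'fluffy', species: 'cat', friend: max };

// Copy only the fields present in the delta onto the node.
// Each value is replaced as a whole; fields not in the delta are left alone.
function partialUpdate(node, delta) {
  for (const field of Object.keys(delta)) {
    node[field] = delta[field];
  }
  return node;
}

partialUpdate(max, { name: 'Maximus' });
console.log(fluffy.friend.name);    // Maximus
console.log(fluffy.friend.species); // dog
```

Fluffy's friend field still points at the same node, so the rename is visible through it without Fluffy's node being touched.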

@RangerMauve
Author

Cool, so each field and each value in the field is a separate node in the graph?

@metasean
Collaborator

metasean commented Sep 2, 2015

@RangerMauve - Each object is a node, but not every field is an object. So in @amark's last example, there are only two nodes: fluffy and Maximus. Within fluffy's node, the friend field actually references Maximus' node, not the actual content. In other words, within gun, the data representation is more like:

// ASDF & SAFD represent gun souls.  
// Souls are sort of like UID plus some additional attributes, 
// but actual souls are much less coder friendly than ASDF & SAFD ;-)

{
   // nodes keyed by their soul
   ASDF: {
     _: {'#':'ASDF'},
     name: 'fluffy',
     species: 'cat',
     friend:  {'#':'SAFD'}
   },
   SAFD: {
     _: {'#':'SAFD'},
     name: 'Maximus',
     species: 'dog'
   }
}
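
One way to picture how soul references turn back into a nested document is the sketch below. It assumes the graph is an object keyed by soul and that a link is an object whose only key is '#'; `isLink` and `expand` are hypothetical helpers for this illustration, not gun functions.

```javascript
// Hypothetical sketch: a graph keyed by soul, with links written as {'#': soul}.
const graph = {
  ASDF: { _: { '#': 'ASDF' }, name: 'fluffy', species: 'cat', friend: { '#': 'SAFD' } },
  SAFD: { _: { '#': 'SAFD' }, name: 'Maximus', species: 'dog' }
};

// A link is assumed to be an object whose only key is '#'.
function isLink(value) {
  return value !== null && typeof value === 'object' &&
    Object.keys(value).length === 1 && typeof value['#'] === 'string';
}

// Rebuild a nested document starting from one soul.
// `seen` maps souls to already-built objects so circular references terminate.
function expand(graph, soul, seen = new Map()) {
  if (seen.has(soul)) return seen.get(soul);
  const doc = {};
  seen.set(soul, doc);
  for (const [field, value] of Object.entries(graph[soul])) {
    if (field === '_') continue; // skip the metadata
    doc[field] = isLink(value) ? expand(graph, value['#'], seen) : value;
  }
  return doc;
}

console.log(JSON.stringify(expand(graph, 'ASDF')));
// {"name":"fluffy","species":"cat","friend":{"name":"Maximus","species":"dog"}}
```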

@amark actually has more details at https://github.com/amark/gun/wiki/JSON-Data-Format and https://github.com/amark/gun/wiki/Partials-and-Circular-References We've heard the documentation can be confusing. If you have a chance, we'd love your feedback on these!

@RangerMauve
Author

So then when the name gets updated for SAFD, does that mean that it has its own state that got modified? Is the UUID that @amark mentioned, then, the soul of the object, and the name the name of the field?

@RangerMauve
Author

Awesome. I think I get it now!

@metasean
Collaborator

metasean commented Sep 2, 2015

So then when the name gets updated for SAFD, does that mean that it has its own state that got modified?

Yes, every change within the node will trigger a new state. Within the node's metadata, the state's value is associated with any modified field. In the following example, an object with name: "Mark Nadal" was initially created (state 1), then it was updated to include his GitHub join date (state 2), and then his handle and status as a hacker were updated (state 3):

{
  _: {
    '#':'ASDF',
    '>': {
      joined: 2,
      hacker: 3,
      handle: 3,
      name: 1
    }
  },
  joined: 1286604000000,
  hacker: true,
  handle: 'amark',
  name: "Mark Nadal"
}
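
The per-field states in the '>' metadata suggest a field-by-field merge. The sketch below replays the three updates from the example; `mergeNode` is a hypothetical helper rather than a gun API, states are simple counters as in the example, and the boundary check and tie-breaking from the conflict resolution rules are left out for brevity.

```javascript
// Hypothetical sketch; mergeNode is not a gun API. States are simple counters here.
function mergeNode(node, delta) {
  const known = node._['>'];
  const incoming = delta._['>'];
  for (const field of Object.keys(incoming)) {
    // Accept a field only if its incoming state is above the known one.
    if (known[field] === undefined || incoming[field] > known[field]) {
      node[field] = delta[field];
      known[field] = incoming[field];
    }
  }
  return node;
}

// State 1: the object is created with a name.
const node = { _: { '#': 'ASDF', '>': { name: 1 } }, name: 'Mark Nadal' };
// State 2: the GitHub join date is added.
mergeNode(node, { _: { '#': 'ASDF', '>': { joined: 2 } }, joined: 1286604000000 });
// State 3: handle and hacker status are set.
mergeNode(node, { _: { '#': 'ASDF', '>': { hacker: 3, handle: 3 } }, hacker: true, handle: 'amark' });
// An outdated update (state 0) for name is ignored.
mergeNode(node, { _: { '#': 'ASDF', '>': { name: 0 } }, name: 'old' });

console.log(node.name);   // Mark Nadal
console.log(node._['>']); // per-field states: name 1, joined 2, hacker 3, handle 3
```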

Is the UUID that @amark mentioned, then the soul of the object, and the name the name of the field?

His earlier description is a bit more generic than gun itself.

We use the soul as a node's UUID.

Name is slightly more complex. At the object level, we support 'keys' being associated with objects (https://github.com/amark/gun/wiki/JS-API#key-gunputobjectkeykey-callback-options). For clarity's sake, within an object, we refer to JavaScript keys (i.e. the key in a key/value pair) as a "field" or the "fieldname". Souls, keys, and fields each allow you to "reference the actual value without knowing what it is in advance" in different situations. At this point, we recommend users try to use keys and fields as much as possible. As we continue to develop gun, keys should become the predominant method of accessing data.

@amark
Owner

amark commented Sep 3, 2015

Thanks @metasean, that was more thorough and gun implementation specific. One quick clarification is that @metasean used an incrementing counter for simplicity's sake (in the same way "ASDF" is short for simplicity's sake). Those states are actually local timestamps (with potential drift) in gun's implementation (but as mentioned before, the conflict resolution algorithm uses them as boundaries, not as timestamps, even though they are recorded as timestamps).

@markmarijnissen

If you want users to drop in a file and go, why not include the build file in your repo? You can easily include a developer and minified version.

I am concerned that the specific persistence/sync implementation can be improved (i.e. socket io) while the core logic (graph creation, conflict resolution and graph traversal/manipulation) is actually the stuff I want.

@amark
Owner

amark commented Feb 8, 2016

@markmarijnissen agreed, we can already swap in socket.io (as I mentioned) as your transport layer, you do this by making a socket.io driver for GUN.

As far as separating graph traversal/creation/resolution, agreed. We're trying to improve on this in 0.4. I'm glad you're commenting about this.

What projects are you working on? Hit us up on http://gitter.im/amark/gun, I would love to learn about them.

@amark
Owner

amark commented Feb 20, 2017

Re: original issue, GUN v0.5.x and above now have a cool npm run unbuild step that lets you package/bundle gun however you want with webpack/browserify what have you. :) Enjoy! Closing.

@amark amark closed this as completed Feb 20, 2017