Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract dependency free API and tests #71

Closed
wants to merge 21 commits into from
Closed

Conversation

goofballLogic
Copy link
Member

No description provided.

@goofballLogic goofballLogic changed the title wip Extract dependency free API and tests Jul 30, 2020
@goofballLogic goofballLogic linked an issue Jul 30, 2020 that may be closed by this pull request
@goofballLogic goofballLogic mentioned this pull request Aug 19, 2020
@goofballLogic
Copy link
Member Author

goofballLogic commented Aug 19, 2020

@asbjornu would appreciate initial feedback on direction here. Basically the approach has been to introduce a new Raw namespace at both JsonLD.Core.Raw and JsonLD.Test.Raw which duplicates the JsonLdProcessor and associated classes but removes the (external) dependency on Newtonsoft JSON - instead JSON is passed in as raw strings. Similarly tests use raw strings (for now parsed using a drop in a lightweight single-file JSON parser).

At the moment, only Conformance tests are implemented - there are others which need to be completed, but I wanted to stop at this point and see if you're comfortable with the overall direction.

If so, next steps would be:

  1. Complete unit test implementation for Newtonsoft-free interface
  2. Deal with the question of https://github.com/linked-data-dotnet/json-ld.net/pull/71/files#diff-4d8cb58a0d7bf5ab2667fade0139e479R52
  3. Merge back to master
  4. I've noticed that some unit tests are internally throwing errors (which are then caught and silently ignored 🤕). Would be good to go back and fix this unfortunate state of affairs before proceeding.
  5. Layout and white-space are fairly non-standard and generally horrific in the codebase. I'd like to clean that up before proceeding. (Dealing with other horrible things in the code should be deferred)
  6. Propose minor release adding the string-based interface (as an alternative to the default Newtonsoft one)

@asbjornu
Copy link
Member

Thanks for taking on this, @goofballLogic!

@asbjornu would appreciate initial feedback on direction here. Basically the approach has been to introduce a new Raw namespace at both JsonLD.Core.Raw and JsonLD.Test.Raw which duplicates the JsonLdProcessor and associated classes but removes the (external) dependency on Newtonsoft JSON - instead JSON is passed in as raw strings. Similarly tests use raw strings (for now parsed using a drop in a lightweight single-file JSON parser).

I'm not quite sure about the architectural direction taken here. If I were to tackle this myself, I would probably introduce an intermediary object model in the core namespace that knew nothing about JSON or parsing, that could then be built as a result from a parser using different engines in projects such as JsonLDNet.Newtonsoft and JsonLDNet.Json or something similar.

I thus feel that having a Raw namespace within the core may be the wrong approach, but I'm not familiar enough with the codebase to be certain about it.

  1. Complete unit test implementation for Newtonsoft-free interface
  2. Deal with the question of https://github.com/linked-data-dotnet/json-ld.net/pull/71/files#diff-4d8cb58a0d7bf5ab2667fade0139e479R52
  3. Merge back to master
  4. I've noticed that some unit tests are internally throwing errors (which are then caught and silently ignored 🤕). Would be good to go back and fix this unfortunate state of affairs before proceeding.
  5. Layout and white-space are fairly non-standard and generally horrific in the codebase. I'd like to clean that up before proceeding. (Dealing with other horrible things in the code should be deferred)
  6. Propose minor release adding the string-based interface (as an alternative to the default Newtonsoft one)

I think this TODO-list is reasonable. 👍

@goofballLogic
Copy link
Member Author

Thanks for taking on this, @goofballLogic!

@asbjornu would appreciate initial feedback on direction here. Basically the approach has been to introduce a new Raw namespace at both JsonLD.Core.Raw and JsonLD.Test.Raw which duplicates the JsonLdProcessor and associated classes but removes the (external) dependency on Newtonsoft JSON - instead JSON is passed in as raw strings. Similarly tests use raw strings (for now parsed using a drop in a lightweight single-file JSON parser).

I'm not quite sure about the architectural direction taken here. If I were to tackle this myself, I would probably introduce an intermediary object model in the core namespace that knew nothing about JSON or parsing, that could then be built as a result from a parser using different engines in projects such as JsonLDNet.Newtonsoft and JsonLDNet.Json or something similar.

I thus feel that having a Raw namespace within the core may be the wrong approach, but I'm not familiar enough with the codebase to be certain about it.

  1. Complete unit test implementation for Newtonsoft-free interface
  2. Deal with the question of https://github.com/linked-data-dotnet/json-ld.net/pull/71/files#diff-4d8cb58a0d7bf5ab2667fade0139e479R52
  3. Merge back to master
  4. I've noticed that some unit tests are internally throwing errors (which are then caught and silently ignored 🤕). Would be good to go back and fix this unfortunate state of affairs before proceeding.
  5. Layout and white-space are fairly non-standard and generally horrific in the codebase. I'd like to clean that up before proceeding. (Dealing with other horrible things in the code should be deferred)
  6. Propose minor release adding the string-based interface (as an alternative to the default Newtonsoft one)

I think this TODO-list is reasonable. 👍

Just to clarify am I correctly understanding:

  1. my existing design is to use string-based JSON manipulation as the lowest-level abstraction
  2. yours is to use a completely abstract lowest-level abstraction
    ?

@goofballLogic
Copy link
Member Author

goofballLogic commented Aug 31, 2020

To help elucidate the above question @asbjornu , the intent was to essentially rewrite by refactoring (which I think you preferred on the whole). The idea of extracting a .Raw namespace was to establish a dependency free API (string-based, for internal use) which would adhere very closely to the existing Newtonsoft API.

 ------------------         ------------------
| JToken-based API |       | string-based API | (.Raw)
 ------------------         ------------------

Step 1: Initially this string-based API is dependent on the Newtonsoft implementation (along with the Newtonsoft-based API),

 ------------------         ------------------
| JToken-based API |       | string-based API | (.Raw)
 ------------------         ------------------
               |               |
               V               V
           ------------------------------
          |  JToken-based implementation |
           ------------------------------

Step 2: but the whole point of establishing a string-based API was to then create a string-based implementation, and then finally to make the Newtonsoft-based API dependent on the string-based API and implementation.

 ------------------         ------------------
| JToken-based API | ----->| string-based API | (.Raw)
 ------------------         ------------------
                               |
                               V
           ------------------------------
          |  POCO-based implementation   |
           ------------------------------

Step 3: In this way we can finally extract the Newtonsoft-based API into a separate (optional) package depending the now dependency-free one.

---------------- Newtonsoft.json-ld.net -----------------------------

            ------------------
           | JToken-based API |
            ------------------
                     |
                     V
--------------- json-ld.net ------------------------------------
            ------------------
           | string-based API | (.Raw)
            ------------------
                     |
                     V
           ------------------------------
          |  POCO-based implementation   |
           ------------------------------

Once this is accomplished, we should then be free to refactor the internal implementation to be pluggable for other engines (as you mentioned). One way to do that could be by creating an entirely abstract skeleton to be the core of the product.

The problem with this refactoring approach is the necessary creation of the .Raw namespace as I do not see any other way to introduce a dependency-free API into the library. Do you have a good alternative?

@asbjornu
Copy link
Member

asbjornu commented Sep 7, 2020

What I'm thinking is that objects such as Context shouldn't be dealing with JToken directly, we need our own intermediary object model that in effect does much of what JToken does now, but which is independent of any external references. This dependency-free object model should then be possible to construct through any serializer and also from any serialization format supported by Linked Data (not limited to JSON).

@goofballLogic
Copy link
Member Author

intermediary

Yes that is very much the intention here. First step is to provide a non-JToken user API, then extract JToken from the logic itself

@asbjornu
Copy link
Member

asbjornu commented Sep 8, 2020

Ok. So this intermediary object model is going to be System.String in step 1 (this PR) and then something else in the future?

@goofballLogic
Copy link
Member Author

goofballLogic commented Sep 9, 2020

Ok. So this intermediary object model is going to be System.String in step 1 (this PR) and then something else in the future?

My thought was that the interface (API) of the library would be string based. Internally we could use a proper object model (e.g. Dictionaries produced by https://github.com/zanders3/json).

Have changed the text diagrams above to say "POCO-based implementation" where before it read "string-based implementation"

In "Step 1" (now labelled in the diagram above), the internal object model would still be JToken. "Step 2" we swap out the JToken implementation for a POCO implementation, "Step 3" we move the JToken API out to another package.

@asbjornu
Copy link
Member

Don't you think consumers of this library would appreciate having a fully typed Context object they can interact with instead of System.String?

@goofballLogic
Copy link
Member Author

goofballLogic commented Sep 18, 2020

Don't you think consumers of this library would appreciate having a fully typed Context object they can interact with instead of System.String?

I suspect that some would. What I was looking for was the lowest-common denominator to start from, then layer such things (typed input, Newtonsoft objects, System.Json objects on top). There are certainly use cases where I want to receive a raw string of JSON (e.g. sent to an API) and expand it before doing anything further.

To be honest, at this very early stage of refactoring, my main design goal was the simplest way to a dependency free API. Strings seemed like a good starting point, which we could refine later. You may be right that an object model should serve as input to the raw API interface.

@asbjornu
Copy link
Member

I agree that it should be possible to get to the original String, but I would prefer it if that was simply available with Context.ToString().

@goofballLogic
Copy link
Member Author

goofballLogic commented Sep 23, 2020

Ok, is it that you want the outward facing API to take an object model for the whole graph being passed in? Or are you simply wanting the option to pass in a typed @context ?

The pro/con with a opinionated type for @context is that will specifically refuse to accept aspects of the syntax which the library does not understand.

For the document other than the @context, are you happy with a List<Dictionary<string, object>> ? Or do you think it should be tighter?

@asbjornu
Copy link
Member

Yes, something like List<Dictionary<string, object>> is fine as a bare minimum abstraction. Where stricter typing would be useful is where we want to reflect the object model of the version of the JSON-LD specification we want to support. I think that most of the keywords in the specification should have a corresponding class in our codebase, for instance. For Context it could look something like this:

public class Context : IList<Dictionary<string, object>>
{
    private readonly string json;

    public override string ToString()
    {
        return this.json;
    }
}

@goofballLogic
Copy link
Member Author

Ok I'll use that as a guide for the internal representation, but I still wonder whether accepting a string of JSON might be a less opinionated way to provide the basic external API - although we could provide a mapping layer I suppose.

@goofballLogic
Copy link
Member Author

Is there any downside to not being able to process JSON-LD documents which contain keywords which we don't yet recognise? I'm not sure what the rules are for unrecognised property names starting with an @ symbol.

@asbjornu
Copy link
Member

asbjornu commented Sep 23, 2020

As the matter of what's fed into the outer layer of the application, it will of course be a String (and possibly Stream which is then read to a String). That String will be parsed into the internal data model of the JSON parser. The JSON data model is then mapped to the public object model of JSON-LD.NET before being returned to the consumer that fed in the initial String.

As an architectural overview, this is how it would look like:

JSON-LD - Page 1

If we detail the process, it looks like this:

JSON-LD - Page 2

@asbjornu
Copy link
Member

Is there any downside to not being able to process JSON-LD documents which contain keywords which we don't yet recognise? I'm not sure what the rules are for unrecognised property names starting with an @ symbol.

As long as we expose everything "loosely typed" in a dictionary, I think we're good.

@goofballLogic
Copy link
Member Author

in "Infrastructure" above, do you propose that lies outside of this library? (and also what is your diagramming tool :) )

@asbjornu
Copy link
Member

No, I'm thinking one project for "Core" and one or more projects defined as "Infrastructure", all in the same solution.

@goofballLogic
Copy link
Member Author

I'm not particular about .sln solutions. I think what's critical is the package structure for nuget (i.e. do people have to install all the Infrastructure stuff, or can they pick and choose).

@goofballLogic
Copy link
Member Author

Perhaps to start moving us in the right direction, I'll move the string-based API into a distinct project (but, for now, within the same solution) with the intent that the Newtownsoft one will move out to its own project once we're happy with the overall direction (and provide users with a smooth upgrade path)

@asbjornu
Copy link
Member

Agreed! Sounds good. 👍

<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netcoreapp2.1</TargetFramework>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would prefer netstandard, but for now not wanting to change the core test project which is netcoreapp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's have netstandard as a goal, but we'll get there when we'll get there.

@@ -1,7 +1,7 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio 15
VisualStudioVersion = 15.0.28010.2026
# Visual Studio Version 16
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i hate sln files

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who doesn't. 😄

@goofballLogic
Copy link
Member Author

goofballLogic commented Sep 23, 2020

I've just noticed that we (very worryingly) have both JsonLdProcessor but also JsonLdApi which are both public classes with publicly exposed methods. the JsonLdApi ones have very few direct unit tests which makes it rather fragile to my mind.

A few of our unit tests are against JsonLdApi (e.g. https://github.com/linked-data-dotnet/json-ld.net/blob/master/test/json-ld.net.tests/ExtendedFunctionalityTests.cs#L107) whereas most are against JsonLdProcessor (e.g. https://github.com/linked-data-dotnet/json-ld.net/blob/master/test/json-ld.net.tests/ConformanceTests.cs#L158 et al).

Wondering if I should park this refactoring for a while and instead go make the tests a bit less of a shambles.

@goofballLogic
Copy link
Member Author

goofballLogic commented Sep 23, 2020

Now that I look at it, we have a simply massive public API to support. These are currently ALL public members (yes even those fields). Is this even supportable going forward?

image

@goofballLogic
Copy link
Member Author

I suppose we could start by deciding to deprecate a large amount of this surface area?

@asbjornu
Copy link
Member

I suppose we could start by deciding to deprecate a large amount of this surface area?

👍 to that.

@goofballLogic
Copy link
Member Author

As the matter of what's fed into the outer layer of the application, it will of course be a String (and possibly Stream which is then read to a String). That String will be parsed into the internal data model of the JSON parser. The JSON data model is then mapped to the public object model of JSON-LD.NET before being returned to the consumer that fed in the initial String.

As an architectural overview, this is how it would look like:

JSON-LD - Page 1

If we detail the process, it looks like this:

JSON-LD - Page 2

Can I ask what you use for your fab diagrams @asbjornu ?

@asbjornu
Copy link
Member

asbjornu commented Oct 9, 2020

Can I ask what you use for your fab diagrams @asbjornu ?

Thanks for thinking they are fab! 😊 I use Lucidchart.

@goofballLogic
Copy link
Member Author

Can I ask what you use for your fab diagrams @asbjornu ?

Thanks for thinking they are fab! 😊 I use Lucidchart.

ah i use that too but usually for flowcharts. Will have to look at it a bit more. Nice palette etc.

@stale
Copy link

stale bot commented Jan 8, 2021

This issue has been automatically marked as stale because it has not had recent activity. After 30 days from now, it will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 8, 2021
@stale stale bot closed this Feb 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support dependency-free API
2 participants