Skip to content

FakeStream implementation#435

Merged
snuyanzin merged 6 commits into
datafaker-net:mainfrom
NarekDW:feature/fake-stream
Oct 17, 2022
Merged

FakeStream implementation#435
snuyanzin merged 6 commits into
datafaker-net:mainfrom
NarekDW:feature/fake-stream

Conversation

@NarekDW

@NarekDW NarekDW commented Oct 14, 2022

Copy link
Copy Markdown
Contributor

This is the FakeStream simple implementation.
But I have a few doubts here:

  1. Is it worth to move FakeStream and FakeCollection classes to separate folder (like util) ?
  2. For now it doesn't support operations like
    Format.toCsv(FakeStream.Builder...) and Format.toJson(FakeStream.Builder...)
    as it (Csv and Json classes in particular) was designed for FakeCollection only and it will take much effort (and much redundant and similar code) to add such an ability. What do you think, how could we improve the code and avoid of writing a lot of similar code? Probably to add some new level of abstraction? Or adding some converter function, which could convert FakeStream to FakeCollection and the pass collection to the functions like Format.toCsv(FakeStream.Builder...) and Format.toJson(FakeStream.Builder...) ?
  3. Also, do we need a common interface or abstract class for FakeCollection and FakeStream, as them contain quite similar API? From my point of view, here we don't need common interface, as in fact them belong to different subjects...

@codecov-commenter

codecov-commenter commented Oct 14, 2022

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 86.44068% with 16 lines in your changes missing coverage. Please review.

Project coverage is 93.04%. Comparing base (5f7318f) to head (c5e9e51).
Report is 1140 commits behind head on main.

Files with missing lines Patch % Lines
...in/java/net/datafaker/sequence/FakeCollection.java 81.81% 3 Missing and 1 partial ⚠️
...main/java/net/datafaker/sequence/FakeSequence.java 90.24% 3 Missing and 1 partial ⚠️
...c/main/java/net/datafaker/sequence/FakeStream.java 80.00% 3 Missing and 1 partial ⚠️
src/main/java/net/datafaker/formats/Json.java 85.71% 1 Missing and 1 partial ⚠️
src/main/java/net/datafaker/formats/Csv.java 93.75% 0 Missing and 1 partial ⚠️
...n/java/net/datafaker/providers/base/BaseFaker.java 66.66% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #435      +/-   ##
============================================
- Coverage     93.07%   93.04%   -0.04%     
- Complexity     2067     2084      +17     
============================================
  Files           229      231       +2     
  Lines          4263     4315      +52     
  Branches        432      441       +9     
============================================
+ Hits           3968     4015      +47     
- Misses          183      185       +2     
- Partials        112      115       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bodiam

bodiam commented Oct 14, 2022

Copy link
Copy Markdown
Contributor

@snuyanzin can you help in answering some of the above? I'm not sure far you got with your template refactoring.

@snuyanzin

Copy link
Copy Markdown
Collaborator

Hi @NarekDW great to see here and thanks for the contribution.
Agree about common classes.
First I would say we could completely reuse Builder and would say replace build() methods with collection() to build FakeCollection and stream() to build FakeStream... In theory for backward compatibility build() method could be still kept and marked as deprecated.
What do you think?

I will try to have a deeper look later

@NarekDW

NarekDW commented Oct 14, 2022

Copy link
Copy Markdown
Contributor Author

Hi @snuyanzin glad to see you 😊
Thank you for the response :)
As I understood, you want to combine FakeCollection and FakeStream into one class and reuse common Builder, right?
If so, then I see some issues here:

  1. Collection and Stream have different nature, Stream is not Collection and vice versa.
  2. At first glance I see a number of challenges in implementation, for example:
    2.1 initialization issue.
    FakeCollection - is always a fixed sized data structure (with default value = 10).
    FakeStream - could be an infinite size data structure.
    2.2 Issue to distinguish which data structure was created by the client (Collection or Stream)...

Correct me if I’m wrong please.
Don't you think that it would be better to keep them (FakeCollection and FakeStream) separately?

@snuyanzin

snuyanzin commented Oct 15, 2022

Copy link
Copy Markdown
Collaborator

Not really. FakeCollection and FakeStream could be in separate classes
At the same time as you suggested above there could be an interface e.g. Sequence (probably there could be better name).
Both FakeCollection and FakeStream could implement it.
At the same time both builders are the same, so it could be extracted to this Sequence interface or even separately.

Initialization issue could be easily workarounded e.g.

    public static class Builder<T> {
        private final List<Supplier<T>> suppliers;
        private int minLength = -1; // negative means same as maxLength
        private int maxLength = -1; 
        private double nullRate = 0d;
        private BaseProviders faker;
    ...

    public FakeStream<T> buildStream() {
      ...
    }

    public FakeCollection<T> buildCollection() {
        maxLength = maxLength == -1 ? 10 : maxLength;
        ...
     }
}

2.2 Issue to distinguish which data structure was created by the client (Collection or Stream)...

Didn't get this problem. In Faker there could be 2 different methods collection() and stream(). So depends on which was invoked

@NarekDW

NarekDW commented Oct 16, 2022

Copy link
Copy Markdown
Contributor Author

@snuyanzin Ok, got it. Let me try to apply that interface.
As for me, Sequence - is a good name for common interface.

…hem to util folder + add CSV and Json creation support from FakeSequence
@NarekDW NarekDW force-pushed the feature/fake-stream branch from 697545a to bc796dc Compare October 16, 2022 21:01
# Conflicts:
#	src/main/java/net/datafaker/providers/base/BaseFaker.java
#	src/test/java/net/datafaker/fileformats/CsvTest.java
#	src/test/java/net/datafaker/util/FakeCollectionTest.java
@NarekDW

NarekDW commented Oct 16, 2022

Copy link
Copy Markdown
Contributor Author

@snuyanzin

  1. I've added an abstract class FakeSequence, which is common class for FakeCollection and FakeStream classes, with common methods/API/Builder (in the child classes we just overwrite the build method).
  2. Also, I changed CSV and Json creation method to support that common (FakeSequence) class, instead of using FakeCollection only. Added support to generate csv and json from finite/infinite stream.
  3. I moved that classes to the separate folder - util. Is it fine from backward compatibility point of view?
  4. I don't like some classes names like CsvCollectionBasedBuilder and JsonFromCollectionBuilder + JsonForFakeCollection. It could be better to name them *Sequence*, but I didn't change them as they are public classes and the change could lead to some issues with client code (regarding backward compatibility)...

Could you review please?
Is this the same way which you mentioned in your previous comment?

Comment thread src/main/java/net/datafaker/formats/Json.java Outdated
@snuyanzin

snuyanzin commented Oct 17, 2022

Copy link
Copy Markdown
Collaborator

thanks, in general it looks good
there are a few cosmetic comments
1.

I moved that classes to the separate folder - util. Is it fine from backward compatibility point of view?

It's better to have some more self-explanatory names e.g. sequence rather than util
2. Currently there are 2 classes CsvTest within this PR, the one from fileformats package should be removed together with fileformats package

@NarekDW

NarekDW commented Oct 17, 2022

Copy link
Copy Markdown
Contributor Author

Done. Changed the folder name from util to sequence and changed the error message in case of generating csv or json file from infinite sequence.

public abstract FakeSequence<T> build();

/**
* @param <S> [Stream<T>, List<T>]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is a reason of ci failure, also fails locally with mvn clean install
@NarekDW fyi

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, my bad, fixed

@snuyanzin snuyanzin merged commit 7c580d3 into datafaker-net:main Oct 17, 2022
@snuyanzin

Copy link
Copy Markdown
Collaborator

Merged
thanks for the contribution @NarekDW

@bodiam

bodiam commented Oct 17, 2022

Copy link
Copy Markdown
Contributor

@NarekDW This is a really great contribution! Love it! Just wondering: would it be possible for you to add something to our documentation, the markdown files I mean? If you don't have time, no problem, I'm happy to port some of your unit tests to the public website (I saw you added something to the README already) and publish it here or so:

https://www.datafaker.net/documentation/collections/

@NarekDW

NarekDW commented Oct 17, 2022

Copy link
Copy Markdown
Contributor Author

@bodiam Yep, I can do that, but I'm not able to edit the page, which you shared (when I click on the edit button I get 404 not found).
Do you have any guide how to edit the documentation?

@snuyanzin

Copy link
Copy Markdown
Collaborator

this page is a result of generation based on input here https://github.com/datafaker-net/datafaker/blob/main/docs/documentation/collections.md
so you can just edit this file

@NarekDW

NarekDW commented Oct 17, 2022

Copy link
Copy Markdown
Contributor Author

Thanks, let me take a look.

@bodiam

bodiam commented Oct 17, 2022

Copy link
Copy Markdown
Contributor

Thanks!

And sorry about that 404, I've probably configured something wrong, thanks for reporting!

@NarekDW

NarekDW commented Oct 17, 2022

Copy link
Copy Markdown
Contributor Author

Please review - #443
Is that autogeneration (of docs) a part of CI?
How can I check the result? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants