Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tech] simpler gtfs reading api #780

Merged
merged 8 commits into from
May 19, 2021
Merged

[tech] simpler gtfs reading api #780

merged 8 commits into from
May 19, 2021

Conversation

antoine-de
Copy link
Contributor

closes #739 and preparation adding a bit of docs.

Hide the Configuration needed to read GTFS to have easier hello world and an api closer to the ntfs module.

Also add a gtfs::read method that detect if the GTFS is a zip or not (like ntfs::read).

now the canonimal way to read a gtfs is:

let model = transit_model::gtfs::read("path to a zip or directory")?;

For better control there is also:

let model = transit_model::gtfs::read_from_path("path to a directory")?;
let model = transit_model::gtfs::read_from_zip("path to a zip")?;

or even a gtfs::from_read method that takes a reader (cf doc).

When a configuration is needed, a Reader struct needs to be instanciated

let conf = gtfs::Configuration {
            read_as_line: true,
            ..Default::default()
        };
let model = transit_model::gtfs::Reader::new(conf).from(input_dir)?;

Note: I did not use parse_from_* like it was discussed previously because I think this naming is more consistent with the NTFS.

closes #739

hide the `Configuration` needed to read GTFS to have easier hello world and an api closer to the ntfs module.

Also add a `gtfs::read` method that detect if the GTFS is a zip or not (like `ntfs::read`).

now the canonimal way to read a gtfs is:

```rust
let model = transit_model::gtfs::read("path to a zip or directory")?;
```

For better control there is also:

```rust
let model = transit_model::gtfs::read_from_path("path to a directory")?;
```
```rust
let model = transit_model::gtfs::read_from_zip("path to a zip")?;
```
or even a `gtfs::from_read` method that takes a reader (cf doc).

When a configuration is need a `Reader` struct is needed
```rust
let conf = gtfs::Configuration {
            read_as_line: true,
            ..Default::default()
        };
let model = transit_model::gtfs::Reader::new(conf).from(input_dir)?;
```
@antoine-de
Copy link
Contributor Author

damn clippy doesn't like Reader::default().from_path(.) 😢

@antoine-de
Copy link
Contributor Author

antoine-de commented May 10, 2021

I feel the from_ not so bad since they are in a gtfs::Reader struct, so I felt they were quite understandable, but clippy (and @datanel ) does not agree with me 😄

I feel we have several choices (feel free to add more if I missed some):

1 keep it this way

ignore clippy warning

2 use parse in struct as proposed

2a and don't change free functions

this way we are coherent with ntfs::read method

    reader.parse_zip_read()
    reader.parse_zip_path()
    reader.parse_dir_path()
    reader.parse()
// and
transit_model::gtfs::read()
transit_model::gtfs::from_zip()
transit_model::gtfs::from_path()
transit_model::gtfs::from_zip_read()

2b and change free functions

And should we change the ntfs free functions for coherence?

    reader.parse_zip_read()
    reader.parse_zip_path()
    reader.parse_dir_path()
    reader.parse()
// and
transit_model::gtfs::parse()
transit_model::gtfs::parse_zip_path()
transit_model::gtfs::parse_dir_path()
transit_model::gtfs::parse_zip_read()

Conclusion

I'd vote for 1. or 2a. since they are easy to implement and I feel the api is nice, but I'm open to suggestion

woshilapin
woshilapin previously approved these changes May 10, 2021
Copy link
Contributor

@woshilapin woshilapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactorization. For your question, I'd vote 2a too. One is a builder pattern and the other are free functions so I feel they can be named differently. Note that I see one weird thing: we have reader.parse_zip_read() but we have transit_model::gtfs::from_read(). Should we have transit_model::gtfs::from_zip_read()?

// let resp = reqwest::blocking::get(url)?; // or async call
// let data = std::io::Cursor::new(resp.bytes()?.to_vec());
// let model = transit_model::gtfs::from_read(data, &url, configuration)?;
/// ```ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to ignore it? Because reqwest is not part of the dependency-tree? Why didn't it fail before?

For this specific documentation example, we might be able to find another example that doesn't require to bring a big dependency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes reqwest is not a dependency, it cannot be build.
Before the was only 2 / so it was not build at all 😅
Do you have an example in mind? It's true that it's a shame not to build all examples, but I think it can be nice to have a real use case here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened a PR targeting this branch to explore activation of doc test: #782
Let you take only what's fine out of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bringing reqwest will considerably increase the build time as it pull an important number of transitive dependencies. This is one of the reason reqwest was removed from transit_model in the first place. So even if what you propose does work, I'm not sure we want to go that way.

Also, it does bring a dependency to the network inside the tests, I'm not entirely sure if it's risky or not but usually, a test that does not depend on the outside is more reliable.

Copy link
Contributor

@pbougue pbougue May 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, locally I didn't hit the reqwest build time (but it looks like CI doesn't have it so easy).
Added to network and to the files that would move, it's a pain indeed.

Should we add only the transit_model:: for Errors? (and the complete output line for ntfs doc-test)?

@antoine-de
Copy link
Contributor Author

ok for transit_model::gtfs::from_zip_read().

Is it ok for you if I also change the ntfs::from_read too?

@antoine-de
Copy link
Contributor Author

after some discussion the methods are:
image

and
image

(to lazy to copy it and I think the doc is easier to read)

The ntfs module have been changed accordingly:

  • ntfs::from_read -> ntfs::from_zip_reader
  • ntfs::read_from_path -> ntfs::from_dir
  • ntfs::read_from_zip -> ntfs::from_zip`

datanel
datanel previously approved these changes May 10, 2021
src/gtfs/mod.rs Outdated Show resolved Hide resolved
woshilapin
woshilapin previously approved these changes May 10, 2021
Copy link
Contributor

@woshilapin woshilapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why parse_from_zip_reader()? I thought parse_zip_reader() was fine and a bit less a mouthful.

@antoine-de
Copy link
Contributor Author

antoine-de commented May 17, 2021

isn't parse_from_zip_reader more consistent with the other function (parse_from_zip, parse_from_path)?

or we can remove all the from if you prefer (fine for me)

@antoine-de antoine-de dismissed stale reviews from woshilapin and datanel via c61bb1f May 17, 2021 08:29
@woshilapin
Copy link
Contributor

I'd be OK to remove from_ everywhere.

@antoine-de
Copy link
Contributor Author

ok, last call I hope 😅

I only changed the Reader functions:

reader.parse(..)
reader.parse_zip(..)
reader.parse_dir(..)
reader.parse_zip_reader(.., ...)

datanel
datanel previously approved these changes May 18, 2021
@pbougue pbougue self-requested a review May 18, 2021 13:08
pbougue
pbougue previously approved these changes May 18, 2021
Copy link
Contributor

@pbougue pbougue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine refactor, did not review all thoroughly but 👍 from me.

tests/read_gtfs.rs Outdated Show resolved Hide resolved
src/gtfs/mod.rs Show resolved Hide resolved
src/gtfs/mod.rs Show resolved Hide resolved
// let resp = reqwest::blocking::get(url)?; // or async call
// let data = std::io::Cursor::new(resp.bytes()?.to_vec());
// let model = transit_model::gtfs::from_read(data, &url, configuration)?;
/// ```ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened a PR targeting this branch to explore activation of doc test: #782
Let you take only what's fine out of it.

Co-authored-by: Pierre-Etienne Bougué <pierre-etienne.bougue@kisio.com>
@antoine-de antoine-de dismissed stale reviews from pbougue and datanel via a1c89c9 May 18, 2021 13:41
pbougue
pbougue previously approved these changes May 18, 2021
Copy link
Contributor

@pbougue pbougue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for me, pick what you want (if anything) from my PR and close it 👍

@datanel datanel merged commit ab1ec66 into master May 19, 2021
@datanel datanel deleted the simplier_gtfs_api branch May 19, 2021 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[tech] Easier hello world
4 participants