Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New data source #105

Merged
merged 12 commits into from May 28, 2020
Merged

New data source #105

merged 12 commits into from May 28, 2020

Conversation

sunli829
Copy link
Collaborator

I rewrote the code of DatsSource.
The DataSource implemented for Stream has been temporarily removed, and this modification will be added after everyone agrees.

@sunli829 sunli829 requested review from phated and IcanDivideBy0 and removed request for phated May 24, 2020 12:14
Copy link
Contributor

@IcanDivideBy0 IcanDivideBy0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor remarks, otherwise looks good to me!

src/types/connection/connection_type.rs Outdated Show resolved Hide resolved
src/types/connection/cursor.rs Show resolved Hide resolved
Copy link
Contributor

@phated phated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bunch of comments and questions.

src/base.rs Outdated
return None;
}
Some(v[1].to_string().into())
base64::decode(id.as_str())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a weird pattern to force people into. If you structure data properly, this isn't needed because you can do a trip to the database to find the type. You shouldn't be relying on types sent from the client.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed Type::global_id and Type::from_global_id. 🙂

}

impl<C, E, T> Record<C, E, T> {
pub fn new(cursor: C, edge: E, element: T) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! 👍 I was going to implement this for my own stuff.

}

impl<C, T> Record<C, EmptyEdgeFields, T> {
pub fn new_without_edge_fields(cursor: C, element: T) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make new be without EdgeFields and the make this new_with_edge_fields. The right defaults are important.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change the name of my suggestion above, we could call this new_with_additional_fields.

use indexmap::map::IndexMap;
use inflector::Inflector;
use itertools::Itertools;
use std::borrow::Cow;

/// The record type output by the data source
pub struct Record<C, E, T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to map to the relay spec, I believe this should be named Edge

/// The record type output by the data source
pub struct Record<C, E, T> {
cursor: C,
edge: E,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would then be named additional_fields (or something similar)

fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
fn encode_cursor(self) -> String;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think encoding should be fallible too.

src/types/connection/cursor.rs Show resolved Hide resolved
fn to_json(&self) -> Result<serde_json::Value> {
Ok(self.0.to_string().into())
fn encode_cursor(self) -> String {
base64::encode((self as u32).to_be_bytes())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fallible.

#[field(desc = "When paginating backwards, the cursor to continue.")]
pub start_cursor: Option<Cursor>,
/// When paginating backwards, the cursor to continue.
pub start_cursor: Option<String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option<impl CursorType>?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires PageInfo<C>, but it's an internal type, so String might be more convenient.

#[field(desc = "When paginating forwards, the cursor to continue.")]
pub end_cursor: Option<Cursor>,
/// When paginating forwards, the cursor to continue.
pub end_cursor: Option<String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option<impl CursorType>?

@sunli829 sunli829 marked this pull request as draft May 24, 2020 23:54
Copy link
Contributor

@IcanDivideBy0 IcanDivideBy0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a bit of renaming to makes things clearer

src/types/connection/connection.rs Outdated Show resolved Hide resolved
src/types/connection/edge.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@phated phated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really looking good. Other than the naming comments from @IcanDivideBy0, I'm giving 👍

Should we make EdgeFieldsType an Option to remove EmptyEdgeFields from the API? Now that we have an API for new_with_additional_fields, users wouldn't need to operate on the Some/None.

@sunli829
Copy link
Collaborator Author

This is really looking good. Other than the naming comments from @IcanDivideBy0, I'm giving 👍

Should we make EdgeFieldsType an Option to remove EmptyEdgeFields from the API? Now that we have an API for new_with_additional_fields, users wouldn't need to operate on the Some/None.

This may not be removable, maybe I have not found a way. 😀

@phated
Copy link
Contributor

phated commented May 26, 2020

@sunli829 do you want me to try making EdgeFieldsType an Option?

@sunli829
Copy link
Collaborator Author

@sunli829 do you want me to try making EdgeFieldsType an Option?

Yes, thank you, and I want to remove the totalCount on the Connection and add additional fields to the Connection. 😁

@sunli829
Copy link
Collaborator Author

@sunli829 do you want me to try making EdgeFieldsType an Option?

Yes, thank you, and I want to remove the totalCount on the Connection and add additional fields to the Connection. 😁

I am going to solve this problem now. There are potential security issues here, and I found a way to avoid using unsafe code.

// The following code I think is safe.😁

@phated
Copy link
Contributor

phated commented May 26, 2020

👍 I will let you work on it! 🍻

@sunli829
Copy link
Collaborator Author

Is it impossible to make EdgeFieldsType into Option? @phated 😁

@phated
Copy link
Contributor

phated commented May 27, 2020

I didn't work on it because I thought you were trying. Would you like me to work on it?

@sunli829
Copy link
Collaborator Author

I didn't work on it because I thought you were trying. Would you like me to work on it?

Yes. 😅

@sunli829
Copy link
Collaborator Author

I don't know how to do this. Maybe you can help me with it. 😁

@sunli829
Copy link
Collaborator Author

We may need to remove the generic parameter E on Connection to do this.

@phated
Copy link
Contributor

phated commented May 27, 2020

@sunli829 I don't see an easy way to support this without a lot of refactor.

@sunli829
Copy link
Collaborator Author

@sunli829 I don't see an easy way to support this without a lot of refactor.

Only trait objects can be used, which will reduce performance and feel unnecessary.

@phated
Copy link
Contributor

phated commented May 27, 2020

I honestly don't understand the benefit of these extra fields, but I just want this feature finished so I can use it.

src/types/connection/edge.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@phated phated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunli829 this looks GREAT! Let's ship it! :shipit:

@phated phated marked this pull request as ready for review May 27, 2020 16:49
@sunli829 sunli829 merged commit ada2597 into master May 28, 2020
@phated
Copy link
Contributor

phated commented May 28, 2020

@sunli829 I added this to my project locally and it is very cool! I'm excited to see @IcanDivideBy0 re-introduce his Stream support

@sunli829
Copy link
Collaborator Author

@sunli829 I added this to my project locally and it is very cool! I'm excited to see @IcanDivideBy0 re-introduce his Stream support

The current implementation does not conflict with StreamDataSource and can be easily added again. @IcanDivideBy0

@sunli829 sunli829 deleted the new-datasource branch May 28, 2020 04:14
@IcanDivideBy0
Copy link
Contributor

@phated @sunli829 I'll take a look a this today!

@mwilliammyers
Copy link
Contributor

mwilliammyers commented May 28, 2020

I don’t know how I didn’t catch this before but according to the Rust guidelines for APIs, it should be with_additional_fields not new_with_additional_fields. I know it’s small but, @sunli829, you said you appreciate clean APIs 🙃

Is it worth changing those?

@phated
Copy link
Contributor

phated commented May 28, 2020

@mwilliammyers seems good. Want to open an issue? I think we can keep both around for awhile since it doesn't hurt anything.

@sunli829
Copy link
Collaborator Author

sunli829 commented May 29, 2020

At the beginning, with_additional_fields was not provided because it usually seems to be an option, but in fact it is required.

async fn execute_query(
            &self,
            _ctx: &Context<'_>,
            after: Option<usize>,
            before: Option<usize>,
            first: Option<usize>,
            last: Option<usize>,
        ) -> FieldResult<
            Connection<
                usize, // cursor
                u32, // node
                MyConnectionFields,
                EmptyFields,
            >,
        >
{
    Connection::new(...). // Connection<usize, u32, EmptyFields, EmptyFields>
          with_additional_fields(MyConnectionFields{ ... }); // Connection<usize, u32, MyConnectionFields, EmptyFields>
}

If with_additional_fields is missing, it will fail to compile.

@sunli829
Copy link
Collaborator Author

Do you just want to change the name of this function?

@phated
Copy link
Contributor

phated commented May 29, 2020

@sunli829 yes it is just to change the name because of Rust guideline recommendation.

new_with_additional_fields ➡️ with_additional_fields

@IcanDivideBy0
Copy link
Contributor

Ok, now that I've spend some hours trying to get thing working for Stream, I'll share here where I've come...

The new CursorType is not really easy to work with, mostly because of the Sync requirement. I've managed to almost completely remove the Sync everywhere from Schema, Resolver, ObjectType impls etc but got stuck at some point with Data. I still think this can be achieved but it's getting too far beyond my comprehension of the crate as a whole :/
(If someone can point me why we cannot remove Sync from almost the whole API, that'd be enlightening)

I think the most easy way would be to remove the CursorType trait which don't really add any value to async_graphql, forcing it as String everywhere it's used and let users encode / decode their Cursor in the execute_query function since it already return a Result. Also It feels a bit strange to return a Connection with the cursor type set to ID and getting a String in the schema for example.

That being said, one change that will be required anyways to implement DataSource for Stream is to change query and execute_query signatues to &mut self

@sunli829
Copy link
Collaborator Author

Let me think for a moment. 😀

@sunli829
Copy link
Collaborator Author

Can you tell me where you use stream datasource?

@IcanDivideBy0
Copy link
Contributor

IcanDivideBy0 commented May 29, 2020

I was trying to bring back

impl<C, T> DataSource for Pin<Box<dyn Stream<(C, T)>>>
where
    C: CursorType,
    T: OutputValueType

@sunli829
Copy link
Collaborator Author

Do you give the results of the MongoDB query to Stream DataSource?

@sunli829
Copy link
Collaborator Author

Ok, now that I've spend some hours trying to get thing working for Stream, I'll share here where I've come...

The new CursorType is not really easy to work with, mostly because of the Sync requirement. I've managed to almost completely remove the Sync everywhere from Schema, Resolver, ObjectType impls etc but got stuck at some point with Data. I still think this can be achieved but it's getting too far beyond my comprehension of the crate as a whole :/
(If someone can point me why we cannot remove Sync from almost the whole API, that'd be enlightening)

I think the most easy way would be to remove the CursorType trait which don't really add any value to async_graphql, forcing it as String everywhere it's used and let users encode / decode their Cursor in the execute_query function since it already return a Result. Also It feels a bit strange to return a Connection with the cursor type set to ID and getting a String in the schema for example.

That being said, one change that will be required anyways to implement DataSource for Stream is to change query and execute_query signatues to &mut self

A few days ago I also wanted to remove Sync, but I failed. 😂

@phated
Copy link
Contributor

phated commented May 29, 2020

I also made my cursor a string. Should we change that?

@sunli829
Copy link
Collaborator Author

I also made my cursor a string. Should we change that?

I think this is not a problem. In most cases, the cursor should be a base64 encoded string.

@sunli829
Copy link
Collaborator Author

sunli829 commented May 29, 2020

In fact, I even want to delete these two.

impl CursorType for String
impl CursorType for ID

I think it is up to users to implement Cursor, not us. 😂

@phated
Copy link
Contributor

phated commented May 29, 2020

@sunli829 we need to delete the cursor type. That's what is causing the Sync problems. Please make it only a String

@sunli829
Copy link
Collaborator Author

Cursor is already implemented for the String type. What is the difference between removing Cursor and replacing it with a String?

@phated
Copy link
Contributor

phated commented May 29, 2020

I believe @IcanDivideBy0 already explained why it doesn't work but maybe we need to see his code?

@IcanDivideBy0
Copy link
Contributor

Ohhh my bad, I don't know why but I was absolutely convinced that a stream of Sync items have to be Sync... So I went straight up in this direction, trying to remov the Sync requirement on CursorType

So, in the end, this compile fine:

use crate::connection::{Connection, CursorType, DataSource, EmptyFields};
use crate::{Context, FieldResult, OutputValueType};
use async_graphql_derive::DataSource;
use futures::Stream;
use std::pin::Pin;

#[DataSource(internal)]
impl<'a, C, T> DataSource for Pin<Box<dyn Stream<Item = (C, T)> + Send + 'a>> // Not Sync !
where
    C: CursorType + Send + Sync + 'a,
    T: OutputValueType + Send + Sync + 'a,
{
    type CursorType = C;
    type NodeType = T;
    type ConnectionFieldsType = EmptyFields;
    type EdgeFieldsType = EmptyFields;

    async fn execute_query(
        &mut self,
        _ctx: &Context<'_>,
        after: Option<Self::CursorType>,
        before: Option<Self::CursorType>,
        first: Option<usize>,
        last: Option<usize>,
    ) -> FieldResult<
        Connection<
            Self::CursorType,
            Self::NodeType,
            Self::ConnectionFieldsType,
            Self::EdgeFieldsType,
        >,
    > {
        unimplemented!()
    }
}

Still a few gotchas tho

  1. We need to get get a mutable ref to the stream to pull nodes out of it
  2. Previously my first aim was to reduce memory usage by not having to pull a large stream into a vec to slice it right after completion. Since performances were pretty bad on previous implementation, I wonder if we really should provide this implementation in the crate.

@phated
Copy link
Contributor

phated commented May 29, 2020

I'm having a hard time conceptualizing how my database stream will be used with the Stream datasource since my return type is a GraphQL Union. The &self reference has been hard for me because I can't implement DataSource against a Rust enum, which we require for the Union macro—which required me to implement a dummy struct just to have a DataSource.

I was hoping to remove that &self reference completely and store data in the Context, but maybe that's the wrong way to think about this?

As for performance, I say we implement it with "bad" performance and then improve the performance once we can start using it in real scenarios. What do you think?

@sunli829
Copy link
Collaborator Author

sunli829 commented May 29, 2020

  1. I think the datasource should be stored in the context and not created at query time, that's why it's Sync and &self.

  2. Query all the data from the database and then filter out most of the records through paging in the stream data source, if my understanding of the stream data source is correct. Then I would never use it, why not use a database query to filter out these records?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants