Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/index management #33

Merged
merged 9 commits into from
Aug 20, 2020
Merged

feature/index management #33

merged 9 commits into from
Aug 20, 2020

Conversation

inzanez
Copy link
Contributor

@inzanez inzanez commented Aug 17, 2020

Initial implementation suggestion for index management on ArangoDB.

src/index.rs Outdated
}

impl Index {
fn default(r#type: Type) -> Self {
Copy link
Owner

@fMeow fMeow Aug 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer implementing Default trait in this case.

impl Default for Index{
    fn default()->Self{
            Index {
            id: None,
            is_newly_created: None,
            selectivity_estimate: None,
            error: None,
            code: None,
            fields: vec![],
            name: None,
            unique: None,
            r#type,
            sparse: None,
            min_length: None,
            deduplicate: None,
            geo_json: None,
            expire_after: None,
            in_background: Some(false),
        }
    }
} 

And I wonder, why not derive Default trait?

#[derive(Default)]
struct Index{...}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because he can't randomy pick the index type, the default is relative to the index type here (r#type argument)

@fMeow
Copy link
Owner

fMeow commented Aug 17, 2020

  1. I prefer use variable name for getter without a get prefix. get_index and get_indexes can simply be index and indexes.
  2. We don't need error and code field in response type. The helper function deserialize_response has already map the response to ArangoError if the error field is true.
  3. It seems that some index type has their own fields. For instance, a geo index only has fields while ttl index has an additional expire_after field. Can we use a fat enum to ensure that a geo index must not has a expire_after field and etc? Also, it seems that you write several methodd to initialize Index. Maybe we can use a builder pattern with the help of typed_builder?

Thank you so much for your kind contribution.

@ManevilleF
Copy link
Contributor

thank you very much for this

@inzanez
Copy link
Contributor Author

inzanez commented Aug 17, 2020

  1. I prefer use variable name for getter without a get prefix. get_index and get_indexes can simply be index and indexes.

Yes, I agree, that sounds reasonable.

  1. We don't need error and code field in response type. The helper function deserialize_response has already map the response to ArangoError if the error field is true.

I'm not quite sure if this is the same field though. I would have to check. error is a bool in this case. And I think it just informs if there's something wrong with the index. So I guess you can have a successful request, but an error on the index you requested, and the error field would show true.

  1. It seems that some index type has their own fields. For instance, a geo index only has fields while ttl index has an additional expire_after field. Can we use a fat enum to ensure that a geo index must not has a expire_after field and etc? Also, it seems that you write several methodd to initialize Index. Maybe we can use a builder pattern with the help of typed_builder?

Well, that's the big question here. I did make an implementation using an enum. That would just mean that for every index you retrieve from the database, you were required to use a match statement before you can act on the index (call any function). If that "restriction" is Ok, we may change to an enum (internally tagged by serde).

Regarding the typed_builder: I do agree that it is convenient, but from a consumer perspective I'm somehow more attached to an extra function like:

pub fn name<S: Into<String>>(mut self, name: S) -> Self {
    self.name = name.into();
    self
}

which I may use with &str as well as String. With the typed_builder on the other hand, I think I'm forced to convert to_string as a consumer of the builder for that type.

@inzanez
Copy link
Contributor Author

inzanez commented Aug 17, 2020

Ok, I switched to the implementation using enum. There's currently a BaseIndex, which has shared properties with Persistent, Hash, Skiplist and Primary. The others are implemented with independent structs.

I also managed to use typed_builder to do the Into conversion automatically.

@ManevilleF
Copy link
Contributor

@fMeow when can this be merged and included in a version?

@fMeow
Copy link
Owner

fMeow commented Aug 19, 2020

Sorry for the late response.

Sorry for the vague wording that I didn't state my opinion clearly enough. For Index, I think we can use a struct to hold fields that are shared between all kind of index, and use enum to represent those varied across indexes. The reason behind is to remove as much Option wrapper as possible. There is a time that almost all fields are wrapped in Option and I have to check if it's None for every field I read, which is terrible in my opinion.

Here is a quick but incomplete proof of concepts:

struct Index{
    // common fiedls that all kind of indexes contain 
    pub fields: Vec<String>,
    #[builder(default, setter(into))]
    pub name: String,
    #[builder(default)]
    pub id: String,
    ...
    // fields depends on index type,
    #[serde(flatten)]
    #[builder(default=xxxx)]
    pub index_type: IndexType
}

enum IndexType{
   GeoIndex{geo_json:bool},
   ....
}

Though I am hesitate to name the field index_type and the enum IndexType because it is not merely a label for index type but also contains data.

Thank you!!

@fMeow
Copy link
Owner

fMeow commented Aug 19, 2020

As for the error and code field, it's exactly the field I mentioned.

As the HTTP guide on read index suggests, it's used to denote whether the request succeeded.

Let me give a quick explanation. All reponse from arangoDB can be divided into two cases: a) successful request b) failed. Almost all responses contain error and code even when the request succeeded, except for document related API. If a request fails, arangoDB will give back a json like {"error":true,"code":400,"message":""}. I think the error and code is useless for a successful request. So I use the deserialize_response to check whether there is a error field in response, and cast it to a ClientError. This is done by manual deserialization of Response .

Again, please feel free to remove the error and code fields.

Thank you! :)

@inzanez
Copy link
Contributor Author

inzanez commented Aug 19, 2020

As for the error and code field, it's exactly the field I mentioned.

As the HTTP guide on read index suggests, it's used to denote whether the request succeeded.

Let me give a quick explanation. All reponse from arangoDB can be divided into two cases: a) successful request b) failed. Almost all responses contain error and code even when the request succeeded, except for document related API. If a request fails, arangoDB will give back a json like {"error":true,"code":400,"message":""}. I think the error and code is useless for a successful request. So I use the deserialize_response to check whether there is a error field in response, and cast it to a ClientError. This is done by manual deserialization of Response .

Again, please feel free to remove the error and code fields.

Thank you! :)

Thanks for this explanation, I removed the fields.

@inzanez
Copy link
Contributor Author

inzanez commented Aug 19, 2020

Sorry for the late response.

Sorry for the vague wording that I didn't state my opinion clearly enough. For Index, I think we can use a struct to hold fields that are shared between all kind of index, and use enum to represent those varied across indexes. The reason behind is to remove as much Option wrapper as possible. There is a time that almost all fields are wrapped in Option and I have to check if it's None for every field I read, which is terrible in my opinion.

Here is a quick but incomplete proof of concepts:

struct Index{
    // common fiedls that all kind of indexes contain 
    pub fields: Vec<String>,
    #[builder(default, setter(into))]
    pub name: String,
    #[builder(default)]
    pub id: String,
    ...
    // fields depends on index type,
    #[serde(flatten)]
    #[builder(default=xxxx)]
    pub index_type: IndexType
}

enum IndexType{
   GeoIndex{geo_json:bool},
   ....
}

Though I am hesitate to name the field index_type and the enum IndexType because it is not merely a label for index type but also contains data.

Thank you!!

Ok, I see what you mean. But I'm not sure if it's really a more ergonomic solution in the end. Please check the following example and decide for yourself what you think might be cleaner. There's certainly less code in this one, but I think I like the abstraction concept of the current implementation better. But then that's only personal preference:

#[macro_use]
extern crate typed_builder;

#[macro_use]
extern crate serde;

#[derive(Debug, Serialize, Deserialize, Default, TypedBuilder)]
struct Index{
    pub fields: Vec<String>,
    #[builder(default, setter(into))]
    pub name: String,
    #[builder(default)]
    pub id: String,
    #[serde(flatten)]
    #[builder(default)]
    pub index_type: IndexType
}

#[serde(rename_all = "camelCase", tag = "type")]
#[derive(Debug, Serialize, Deserialize)]
enum IndexType{
    Persistent{unique:bool, sparse: bool, deduplicate: bool},
    Hash{unique:bool, sparse: bool, deduplicate: bool},
    Geo{geo_json:bool},
}

impl Default for IndexType {
    fn default() -> Self {
        IndexType::Persistent { unique: false, sparse: false, deduplicate: false }
    }
}

fn main() {
    let x = Index::builder().fields(vec![]).name("test").build();
    let json = serde_json::to_string(&x).unwrap();
    println!("{}", json);

    let x = Index::builder().fields(vec![]).name("test").index_type(IndexType::Geo{geo_json:true}).build();
    let json = serde_json::to_string(&x).unwrap();
    println!("{}", json);

    let json = r#"{"fields":[],"name":"test","id":"","type":"persistent","unique":false,"sparse":false,"deduplicate":false}"#;
    let x: Index = serde_json::from_str(json).unwrap();

    println!("{:#?}", x);
}

Please let me know what you think.

@inzanez
Copy link
Contributor Author

inzanez commented Aug 19, 2020

I just wrote that implementation according to what you outlined above, maybe it's easier to think about it that way. I guess both would work out well.

If you prefer the current (latest) implementation, please let me finalize the documentation before merging.

@inzanez
Copy link
Contributor Author

inzanez commented Aug 19, 2020

Documentation updated.

@fMeow
Copy link
Owner

fMeow commented Aug 20, 2020

Yes, I agree the current implement that forces to use match to get any data is not ergonomic. Maybe we can combine the current and previous implementation. Is it good to you?

#[derive(Debug, Serialize, Deserialize, Default, TypedBuilder)]
struct Index{
    pub fields: Vec<String>,
    #[builder(default, setter(into))]
    pub name: String,
    #[builder(default)]
    pub id: String,
    #[serde(flatten)]
    #[builder(default="IndexType::Persistent(Persistent{ unique: false, sparse: false, deduplicate: false })")]
    pub index_type: IndexType
}

#[serde(rename_all = "camelCase", tag = "type")]
#[derive(Debug, Serialize, Deserialize)]
enum IndexType{
    Persistent(PersistentIndex),
    Hash(HashIndex),
    Geo(GeoIndex),
}

struct PersistentIndex{unique:bool, sparse: bool, deduplicate: bool}
struct HashIndex{unique:bool, sparse: bool, deduplicate: bool}
struct GeoIndex{geo_json:bool}

In addition, we can add some handy method to quickly differentiate between each index type.

impl IndexType{

    pub fn persistent(&self)->Option<&PersistentIndex>{
        if let IndexType::Persistent(p)=self{
            Some(p)
        }
        else {
            None
        }
    }
    pub fn is_persistent(&self)->bool{
        self.persistent().is_some()
    }
    // ........
}

Is this API sounds good to you?

Thank you so much for your contribution and quick response!!!!

@fMeow
Copy link
Owner

fMeow commented Aug 20, 2020

Judge from your example, it seem the Index.fileds can be empty. Then maybe we can use a default?

#[derive(Debug, Serialize, Deserialize, Default, TypedBuilder)]
struct Index{
    #[builder(default="Vec::new()")]
    pub fields: Vec<String>,
    #[builder(default, setter(into))]
    pub name: String,
    #[builder(default, setter(into))]
    pub id: String,
    #[serde(flatten)]
    #[builder(default)]
    pub index_type: IndexType
}

@inzanez
Copy link
Contributor Author

inzanez commented Aug 20, 2020

Judge from your example, it seem the Index.fileds can be empty. Then maybe we can use a default?

#[derive(Debug, Serialize, Deserialize, Default, TypedBuilder)]
struct Index{
    #[builder(default="Vec::new()")]
    pub fields: Vec<String>,
    #[builder(default, setter(into))]
    pub name: String,
    #[builder(default, setter(into))]
    pub id: String,
    #[serde(flatten)]
    #[builder(default)]
    pub index_type: IndexType
}

I just added the default value for the fields.

Regarding the implementation with enum: Using this I came to the conclusion that people most often use this index management to create indexes, usually on installation or something. And as this is pretty convenient, I wouldn't change it really. Looks fine I think.

@fMeow fMeow merged commit b2c4234 into fMeow:develop Aug 20, 2020
@fMeow
Copy link
Owner

fMeow commented Aug 20, 2020

This PR is merged now, and will be released in the next several hours.

Thank you soooo much!

@inzanez
Copy link
Contributor Author

inzanez commented Aug 20, 2020

Many thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants