Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support label creation via property values #1762

Merged
merged 32 commits into from Mar 12, 2024

Conversation

DavIvek
Copy link
Contributor

@DavIvek DavIvek commented Feb 26, 2024

Idea of this PR is to support queries that do label creation via property values.
For example:

LOAD CSV FROM "/usr/lib/memgraph/people.csv" WITH HEADER AS row
CREATE (n:row.label  {id: row.id, name: row.name, age: toInteger(row.age), city: row.city});

This won't work for MERGE or MATCH clauses. Problem with these clauses is that we don't know what label we are using while we create a plan (since the evaluation phase comes after the planning phase ), hence we can't rewrite query plans to use indexes instead of scanning through all nodes. A possible approach could be to always use the ScanAll operator but that would be inefficient. For the same reason Memgraph doesn't support using property maps as parameters in MATCH/MERGE clauses.
SET and REMOVE clauses should also work.

[master < Task] PR

  • Check, and update documentation if necessary
  • Provide the full content or a guide for the final git message

To keep docs changelog up to date, one more thing to do:

  • Write a release note here, including added/changed clauses
  • Tag someone from docs team in the comments

Release note:
User can create, set, or remove labels using property values, here is an example:

LOAD CSV FROM "/usr/lib/memgraph/people.csv" WITH HEADER AS row
CREATE (n:row.label  {id: row.id, name: row.name, age: toInteger(row.age), city: row.city});

closes #1509

@DavIvek DavIvek self-assigned this Feb 26, 2024
@DavIvek DavIvek added the Docs needed Docs needed label Feb 26, 2024
@DavIvek DavIvek changed the title Support label manipulation via variables Support label creation via variables Feb 27, 2024
@DavIvek DavIvek marked this pull request as ready for review March 1, 2024 16:00
Copy link
Contributor

@andrejtonev andrejtonev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments.
Still have to get to the end if the PR.

src/query/frontend/ast/cypher_main_visitor.cpp Outdated Show resolved Hide resolved
src/query/frontend/ast/cypher_main_visitor.cpp Outdated Show resolved Hide resolved
src/query/frontend/ast/cypher_main_visitor.cpp Outdated Show resolved Hide resolved
src/query/frontend/ast/ast.hpp Outdated Show resolved Hide resolved
@DavIvek DavIvek added the feature feature label Mar 7, 2024
Copy link
Collaborator

@antoniofilipovic antoniofilipovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work 💪

@DavIvek DavIvek enabled auto-merge (squash) March 12, 2024 11:10
@DavIvek DavIvek merged commit de2e204 into master Mar 12, 2024
6 checks passed
@DavIvek DavIvek deleted the support-label-manipulation-via-variables branch March 12, 2024 12:55
@gitbuda gitbuda added this to the mg-v2.16.0 milestone Mar 12, 2024
@kgolubic
Copy link
Contributor

kgolubic commented Apr 7, 2024

@DavIvek can you please provide me with docs PR and release note?

@DavIvek
Copy link
Contributor Author

DavIvek commented Apr 7, 2024

@kgolubic tomorrow 'll write docs and release notes for PRs I opened for 2.16 release

@DavIvek
Copy link
Contributor Author

DavIvek commented Apr 7, 2024

@kgolubic I'm currently considering where to include this information in the documentation, but I'm unsure of the best location. Initially, I looked into the Expressions section, but it doesn't seem to be a perfect fit there since currently we don't have a section for this type of use cases. Another possibility is within the CREATE clause documentation, but we currently only cover the basic rules for using the CREATE clause (for example, we don't detail how to create properties based on other property values). Including this additional information could significantly expand the pages dedicated to basic clause instructions, which may not be ideal. Perhaps we can include this information as a note within the LOAD CSV documentation, since the primary motivation behind adding this functionality was to simplify the process of importing data from CSV files.
I'm open to suggestions on how to handle this effectively.

@kgolubic
Copy link
Contributor

kgolubic commented Apr 8, 2024

Perhaps we can include this information as a note within the LOAD CSV documentation, since the primary motivation behind adding this functionality was to simplify the process of importing data from CSV files.
I'm open to suggestions on how to handle this effectively.

@DavIvek I would say this is the way to go. Can you take this over?

@katarinasupe
Copy link
Contributor

@DavIvek are there any plans to implement the same for the edge type?

@DavIvek
Copy link
Contributor Author

DavIvek commented Apr 9, 2024

@katarinasupe At the moment no, but it could be implemented in the future, did anyone asked for that?

@antejavor
Copy link
Contributor

This does not work for me on the release candidate, I did get the following output:

memgraph> LOAD CSV FROM "/usr/lib/memgraph/people.csv" WITH HEADER AS row
       -> CREATE (n:row.label  {id: row.id, name: row.name, age: toInteger(row.age), city: row.city});
Client received query exception: TypedValue is of type 'null', not 'string'

cc @matea16 @katarinasupe

@antejavor
Copy link
Contributor

Sorry @DavIvek, ignore my comment from above, so the following works.

The issue was space before the label in the dataset :

id,name,age,city, label
100,Daniel,30,London, PersonA
101,Alex,15,Paris, PersonB
102,Sarah,17,London, PersonA
103,Mia,25,Zagreb, PersonB
104,Lucy,21,Paris, PersonA

We should maybe improve the messaging here @DavIvek? It was not clear to me what the issue is from the message:
Client received query exception: TypedValue is of type 'null', not 'string'

@DavIvek
Copy link
Contributor Author

DavIvek commented Apr 9, 2024

@antejavor I believe you'll encounter the same error message if you attempt to create a label with a null value or an integer, for instance. This error message isn't specific to this particular scenario, but I do agree that the messaging could be improved in these cases.
Should we create an issue for that?

@DavIvek
Copy link
Contributor Author

DavIvek commented Apr 9, 2024

My mistake, you won't encounter the same issue if you try to create a label with an integer; this will result in an error during the parsing phase. However, you will receive the same error message if you use parameters and provide a null or integer value (as the label must be a string type). This error will be detected during the evaluation phase when attempting to resolve the variable or parameter into a node label.
Still, this error messaging isn't directly related to the use case of this pull request, but it certainly could be improved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LOAD CSV does not support parameters for Labels
7 participants