fix padding of names #1

Open
jexp opened this issue Jun 23, 2020 · 0 comments
jexp commented Jun 23, 2020

Neo4j Developer Relations Team,

Thanks for making the Neo4j sandboxes available. I use the movie recommendation sandbox to introduce graph concepts to students who do not have a background in computing.

In using the sandbox, I noticed a minor issue with the actor and director nodes that you may already be aware of, but I figured I would pass it along. Some actors and directors have two nodes, one of which has a space before the name; others have only a single node, but its name also starts with a space. Possibly this comes from splitting the data on commas without trimming the spaces.
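
As a rough illustration of that guess (the input string below is made up, and the actual import may have worked differently), splitting a comma-separated list without trimming leaves a leading space on every name after the first:

```cypher
// Hypothetical input; the real source data may be formatted differently.
WITH "Tom Hanks, Meg Ryan" AS actors
RETURN split(actors, ",") AS untrimmed,                         // ["Tom Hanks", " Meg Ryan"]
       [name IN split(actors, ",") | trim(name)] AS trimmed     // ["Tom Hanks", "Meg Ryan"]
```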

For the students, I have them run a few queries to fix the issue. They may not be the most efficiently written, but are included below.

Thanks again for providing this resource!

Best,

Scott

// The following query merges actors who have two nodes,
// the one having a space at the start of their name is merged into the other.

MATCH (n:Actor)
WHERE NOT n.name STARTS WITH " "
WITH n.name as aname, " "+n.name as paddedname
MATCH (n1:Actor {name: aname}), (n2:Actor {name: paddedname})
WITH [n1,n2] as ns
CALL apoc.refactor.mergeNodes(ns, {properties:'discard'}) YIELD node
RETURN node

// The following query should be run after the query above and removes
// the space at the start of actor names where there was only one node
// for that actor

MATCH (n:Actor)
WHERE n.name STARTS WITH " "
SET n.name = TRIM(n.name)
RETURN n

// The following query merges directors who have two nodes,
// the one having a space at the start of their name is merged into the other.

MATCH (n:Director)
WHERE NOT n.name STARTS WITH " "
WITH n.name as aname, " "+n.name as paddedname
MATCH (n1:Director {name: aname}), (n2:Director {name: paddedname})
WITH [n1,n2] as ns
CALL apoc.refactor.mergeNodes(ns, {properties:'discard'}) YIELD node
RETURN node

// The following query should be run after the query above and removes
// the space at the start of director names where there was only one node
// for that director

MATCH (n:Director)
WHERE n.name STARTS WITH " "
SET n.name = TRIM(n.name)
RETURN n
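
To check whether the cleanup worked, a query along these lines might help. It is only a sketch and assumes the same `Actor` and `Director` labels and `name` property used in the queries above:

```cypher
// Count names that still start with a space, grouped by the node's labels.
MATCH (n)
WHERE (n:Actor OR n:Director) AND n.name STARTS WITH " "
RETURN labels(n) AS labels, count(*) AS stillPadded
```

If the four queries above did their job, this should return no rows.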

@jexp jexp self-assigned this Nov 20, 2020