Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for nulls that aren't null and fix them with an actual null #70

Open
steveoh opened this issue Aug 3, 2021 · 9 comments
Open
Labels
help wanted Extra attention is needed sweeper new sweeper ideas

Comments

@steveoh
Copy link
Member

steveoh commented Aug 3, 2021

Create a sweeper that replaces

  • "" empty string -> null
  • " " empty spaces -> null
  • "null" null string (ignoring case) -> null
  • "" null string with brackets (ignoring case) -> null

What other values does the data team find in real world data?

empty, -, ...?

@steveoh steveoh added help wanted Extra attention is needed sweeper new sweeper ideas labels Aug 3, 2021
@stdavis
Copy link
Member

stdavis commented Aug 3, 2021

I can see translating the null strings to native null values, but converting empty strings to be null seems like it may be a bad idea. I feel like there might be cases where having both empty strings and null values are important.

@steveoh
Copy link
Member Author

steveoh commented Aug 3, 2021

I struggle to see the case where an empty string is important. Would it be web map labeling or a where clause?

@ZachBeck
Copy link
Member

ZachBeck commented Aug 4, 2021

I think @gregbunce has code that can't handle nulls in certain fields in the address points. I imagine the code could be changed though...

@gregbunce
Copy link
Member

gregbunce commented Aug 4, 2021

I think you're right. It might be the UT unique ID field. I can work around it. It will cause some scripts to break but it can be worked out. In general I'd like to have nulls over empty stings in all the data, though. I think that conforms more to the standard and is easier when running queries.

@stdavis
Copy link
Member

stdavis commented Aug 4, 2021

I'm not a data guy anymore so take it for what it's worth. But I remember there being a difference between empty (null) and intentionally set as blank ('').

@steveoh
Copy link
Member Author

steveoh commented Aug 4, 2021

I'm not a data guy anymore so take it for what it's worth. But I remember there being a difference between empty (null) and intentionally set as blank ('').

Is that like null vs undefined?

@gregbunce
Copy link
Member

yeah, i go the extra mile each month to convert all the nulls in about 10 different fields to empty string "". it's a pain, but I was told downstream scripts and processes depend on there being empty stings in there (and no nulls). again, nulls would make my life easier, but I don't fully understand the downstream processes dependent on empty strings. I suppose that's step one.

@steveoh
Copy link
Member Author

steveoh commented Aug 4, 2021

It seems more straight forward in scripts to test for null. I do remember arcmap being weird about it depending on the data source.

In a definition query field is NULL is correct right?

In scripts it would be

#: yada yada

string_field, null_field = cursor.next()

if null_field is None:
  #: field is null do something
  pass

if not (string_field and string_field.strip()):
  #: string is null empty or blank, do something
  pass

#: yada yada

I realize we don't live in a perfect world and there might be some heartburn with this. Similar to the domain code issue. But do we agree that it is the right™ thing to do? Or does the null vs undefined argument mean that empty strings should be kept as empty strings but other values can be converted to a real null?

@stdavis
Copy link
Member

stdavis commented Aug 4, 2021

Is that like null vs undefined?

Yes, I view it as the same situation.

Or does the null vs undefined argument mean that empty strings should be kept as empty strings but other values can be converted to a real null?

This would be my vote. But again, I'm not a data guy and it doesn't affect me either way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed sweeper new sweeper ideas
Projects
None yet
Development

No branches or pull requests

4 participants