Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lookup() option to print unknown values. #269

Merged
merged 5 commits into from
Nov 22, 2022

Conversation

blackwinter
Copy link
Member

@blackwinter blackwinter commented Nov 16, 2022

I've opted to reuse the file-related options from print_record(), the actual behaviour is triggered by a boolean flag (print_unknown).

Resolves #212.


General issue: We can't differentiate between method options and a local (unnamed) map.

@fsteeg fsteeg assigned blackwinter and unassigned fsteeg Nov 17, 2022
@blackwinter
Copy link
Member Author

Oh, you were too quick ;) I forgot to mention that there's a subtle behaviour change here: If a key is associated with a null value, it's logged as unknown and the default value is returned (if given, otherwise the key is returned); previously, the key would have been returned even with a default value present. This only applies to Java maps, of course, since Fix doesn't have null values.

@fsteeg
Copy link
Member

fsteeg commented Nov 17, 2022

If a key is associated with a null value, it's logged as unknown and the default value is returned (if given, otherwise the key is returned); previously, the key would have been returned even with a default value present.

+1, sounds like an improvement!

@blackwinter
Copy link
Member Author

Heads up: I think this won't work across multiple records (unless a separate file is written for each record, using counter and/or record ID). I have yet to verify, but am pretty sure ;) (see also metafacture/metafacture-core#475)

README.md Show resolved Hide resolved
@blackwinter blackwinter force-pushed the 212-addLookupOptionToPrintUnknownValues branch from e4b4e66 to 356a5e7 Compare November 18, 2022 15:19
@blackwinter
Copy link
Member Author

Fixed printing unknown values across multiple records in 356a5e7.

(Also rebased on top of master, i.e. 5d756eb)

@TobiasNx
Copy link
Collaborator

TobiasNx commented Nov 21, 2022

The append option is a little bit unclear. If append:"false" assumed that the file is overwritten per metafacture run not per record. But it seems that the file is overwritten per record.

See here: #270

@blackwinter
Copy link
Member Author

blackwinter commented Nov 21, 2022

If append:"false" assumed that the file is overwritten per metafacture run not per record.

No, with append mode disabled, the file is overwritten for every lookup() call. Otherwise, it gets appended to instead of overwritten.

But it seems that the file is overwritten per record.

Exactly, that's the expected behaviour.

(See also the unit tests.)

@TobiasNx
Copy link
Collaborator

If append:"false" assumed that the file is overwritten per metafacture run not per record.

anyway to make this a feature?
Use case: In OERSI every day all records are transformed. Therefore append would generate a lot of duplicates because of the daily transformation of each record.

@blackwinter
Copy link
Member Author

anyway to make this a feature?

What do you have in mind? A timestamp in the file name?

Therefore append would generate a lot of duplicates because of the daily transformation of each record.

But you are going to add unknown values over time, aren't you? And then you'd delete the file.

Otherwise, I would suggest to just rotate the file.

@TobiasNx
Copy link
Collaborator

anyway to make this a feature?

What do you have in mind? A timestamp in the file name?

I thought to overwrite just once, the first time the lookup is run.

Therefore append would generate a lot of duplicates because of the daily transformation of each record.

But you are going to add unknown values over time, aren't you? And then you'd delete the file.

Otherwise, I would suggest to just rotate the file.

Yes I am, but I have no direct control over the productive OERSI system. I cannot delete it by my own. The deletion process would be done by the infrastructure and the rotation need be prepared by @fsteeg then.

@blackwinter
Copy link
Member Author

I thought to overwrite just once, the first time the lookup is run.

We can try if that's the requirement.

@fsteeg: Want to weigh in?

@TobiasNx
Copy link
Collaborator

TobiasNx commented Nov 21, 2022

I thought to overwrite just once, the first time the lookup is run.

We can try if that's the requirement.

@fsteeg: Want to weigh in?

If this is to much hastle. Let's go with this as it is for now, and evaluate later if this is a needed feature!

One other thing. I tried to set id to name but the printed output has not the value of name in it:
#270

@blackwinter
Copy link
Member Author

One further thing. I tried to set id to name but the printed output has not the value of name in it: #270

  1. The record ID is not included in the output, it's only meant for inclusion in the file name (destination).
  2. It's actually not implemented as documented right now. Going to fix.

@blackwinter
Copy link
Member Author

Fixed destination and also added prefix option (same as print_record()'s prefix parameter).

@blackwinter
Copy link
Member Author

Let's go with this as it is for now, and evaluate later if this is a needed feature!

That's fine with me, but we should keep in mind that such a change might not be possible in a backwards-compatible manner.

@blackwinter blackwinter merged commit 72196e7 into master Nov 22, 2022
@blackwinter blackwinter deleted the 212-addLookupOptionToPrintUnknownValues branch November 22, 2022 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Add lookup option to save not matching values in extra file
3 participants