Skip to content

Desktop CSV Serializer#522

Merged
subdavis merged 7 commits into
masterfrom
desktop/csv-serializer
Jan 8, 2021
Merged

Desktop CSV Serializer#522
subdavis merged 7 commits into
masterfrom
desktop/csv-serializer

Conversation

@subdavis
Copy link
Copy Markdown
Contributor

@subdavis subdavis commented Jan 7, 2021

Includes non-trivial build configuration changes. See package.json, tsconfig, and README for explanation.

The change of import Track from 'viame-web-common/track' in the parser (in order to use Track.trackExceedsThreshold) unleashed a lot of hidden trouble. I've understood some new things about the typescript compiler and simplified the tsconfig, and was also able to drop ts-node dependency.

yarn watch:cli will autorebuild the cli, which you can run with yarn divecli.

Required by #493 #509 #487

@subdavis subdavis marked this pull request as ready for review January 7, 2021 15:31
Comment thread client/src/track.ts
if (!this.features[feature.frame].keyframe) {
throw new Error('setFeature must be called with keyframe=true OR to update an existing keyframe');
}
listInsert(this.featureIndex, feature.frame);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BryonLewis this is a really important change that could have consequences if I'm wrong about it.

I've tested it myself, and it looks like, by coincidence, this condition was already always true, but you could call setFeature without a keyframe and it wouldn't be marked as keyframe=true, or get added to the index, but it would behave like one. Added explicit error and associated unit test.

Please check my work here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, the json2csv serializer assumes (and indeed, the python version does) that all features found in the json are keyframes.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it looks like we force any geometry feature change/addition to have keyframe:true. I like the idea that if you have a feature (geometry) you are keyframe at least used for box interpolation. The only possible hiccup would be a weird decision in the future to mix geoJSON types and interpolation, but that would cause a much larger change elsewhere. I.E: supporting having different keyframes for lines vs polygons, but I don't see this being implemented at all nor would I want to.

@subdavis subdavis force-pushed the desktop/csv-serializer branch from 8ebf973 to 43d85dc Compare January 7, 2021 15:40
@subdavis subdavis requested a review from BryonLewis January 7, 2021 15:53
Copy link
Copy Markdown
Collaborator

@BryonLewis BryonLewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to pass most of my testing and just a few minor comments and the name -> key thing.

Below is just a log of my testing procedure:

  • Multi confidence pairs on tracks works properly, although there is that note about filtering on multi in the future.
  • DIVE Pipelines properly ran and exported afterwards
  • Track Attributes and detection attributes are imported and exported properly (I know we don't support them on desktop but I wanted to test to make sure it worked). Even when introducing spaces in the results.
  • Multi feature exporting was how I noticed the point key being off, but for polys it worked perfectly.

column2,
feature.frame,
...(feature.bounds as number[]),
sortedPairs[0][1],
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may want a TODO here. Awesome that you're supporting multiple confidence pairs so we can eventually have type dependent confidence thresholds, but that would introduce a bug here later.
fish1: confidence -0.90 | threshold: 0.92 | Not included
fish2: confidence -0.85 | threshold: 0.80 | Included
fish3: confidence -0.35 | threshold: 0.45 | Not included

So the returned type would default to fish1 even though it is excluded
Definitely out of scope for this PR, but a TODO or comment may be nice as a reminder if we go to implement multiple thresholds.

Comment thread client/src/track.ts
if (!this.features[feature.frame].keyframe) {
throw new Error('setFeature must be called with keyframe=true OR to update an existing keyframe');
}
listInsert(this.featureIndex, feature.frame);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it looks like we force any geometry feature change/addition to have keyframe:true. I like the idea that if you have a feature (geometry) you are keyframe at least used for box interpolation. The only possible hiccup would be a weird decision in the future to mix geoJSON types and interpolation, but that would cause a much larger change elsewhere. I.E: supporting having different keyframes for lines vs polygons, but I don't see this being implemented at all nor would I want to.

y.positional('meta', {
description: 'The metadata to parse',
type: 'string',
}).demandOption('meta');
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Confirming that this limits the cli converter to only be used on Desktop generated JSON for right now? Obviously any download JSON from the web wouldn't convert because of this requirement. The meta includes the fps, confidenceFilters and the imagenames used for exporting. This things could be removed to enable support but I believe we want to have image names/video frame timestamps in all outputs anyways.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI is, so far, only used by us developers to test and debug serializers. the information from meta.json is always required. I'm not really interested in coming up with some other means of supplying that data to the serializer, such as implementing directory crawl, girder integration, etc.

row.push(`${PolyToken} ${coordinates.map(Math.round).join(' ')}`);
} else if (geoJSONFeature.geometry.type === 'Point') {
if (geoJSONFeature.properties) {
const kpname = geoJSONFeature.properties.name;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const kpname = geoJSONFeature.properties.name;
const kpname = geoJSONFeature.properties.key;

Fairly certain that should be .key instead of .name
https://github.com/VIAME/VIAME-Web/blob/master/server/viame_server/serializers/viame.py#L290

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted.

Object.values(data).forEach((track) => {
const filters = meta.confidenceFilters || {};
if (!options.excludeBelowThreshold
|| Track.trackExceedsThreshold(track.confidencePairs, filters)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function returns an array, which will always evaluate true, so there was a major bug here.

In fixing that, I actually implemented multi-confidence-pair filtering so that the TODO you asked for is done.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't know how I missed that one, should add testing of exporting with filtered results to my test procedure.

}
writer.write(
[`# Written on ${(new Date()).toLocaleString()} by dive_csv_writer:typescript`],
[`# Written on ${(new Date()).toLocaleString().replace(',', '')} by dive_writer:typescript`],
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

localeString was returning a string with a comma, causing it to get quoted, which is bad.

@subdavis
Copy link
Copy Markdown
Contributor Author

subdavis commented Jan 8, 2021

@BryonLewis comments resolved in 9b5e8f2

I want to have proper testing for this code, but I want to get training working first, and we still want tests for both serializers.

Rather than invent our own testing infrastructure, I think it would be better to have some shared input/output sample files then use the language harness to load and evaluate them. So pytest / jest wrappings around an exhaustive set of CSV and json files.

Copy link
Copy Markdown
Collaborator

@BryonLewis BryonLewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sortedPairs could still be an empty array depending on the confidenceFilter threshold.

Comment thread client/README.md Outdated
Comment on lines +42 to +43
# Watch
yarn watch:cli
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this supposed to work?

const pairsAboveThreshold = Track.trackExceedsThreshold(track.confidencePairs, filters);
if (!options.excludeBelowThreshold || pairsAboveThreshold.length) {
/* Include only the pairs that exceed the threshold in CSV output */
const sortedPairs = pairsAboveThreshold.sort((a, b) => a[1] - b[1]);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can technically be an empty array still causing the sortedPairs[0][1] further down to fail.
By default you have exludeBelowThreshold equal to false, so if pairsAboveThreshold.length is 0 it will try to sort and display them. Need to abort the track when the pairsAboveThreshold.length

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can technically be an empty arra

This code allows it, but that would be undefined behavior. There has to be at least 1 confidence pair for every annotation.

Probably the conditional should be like this:

  const confidencePairs = options.excludeBelowThreshold
        ? Track.trackExceedsThreshold(track.confidencePairs, filters)
        : track.confidencePairs;
      if (confidencePairs.length) {
        /* Include only the pairs that exceed the threshold in CSV output */
        const sortedPairs = confidencePairs.sort((a, b) => a[1] - b[1]);

Would that resolve it?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that sortedPairs could be an empty array because pairsAbovethreshold could be empty if all items are above the threshold and the excludeBelowThreshold is set to false.

A quick look and this would be my immediate fix as well.

@subdavis
Copy link
Copy Markdown
Contributor Author

subdavis commented Jan 8, 2021

Resolved again ea34e45

@subdavis subdavis merged commit ed4984d into master Jan 8, 2021
@subdavis subdavis deleted the desktop/csv-serializer branch January 8, 2021 19:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants