gopro-telemetry - performance question #196

Closed
forna opened this issue Apr 24, 2023 · 31 comments

Comments

@forna

forna commented Apr 24, 2023

I am testing the conversion of a 2 GB GoPro video using both gpmfExtract and gopro-telemetry in the browser.
The gpmfExtract part takes about 2.5 seconds (with the worker disabled since it's crashing the browser).
The gopro-telemetry part needs about 30 seconds.
I have also analyzed the gpmfExtract output by saving it to disk as a Blob: it is 2.68 MB, so not a large amount of data.
I am using the latest [v1.2.0] version of the code.

Is it normal for gopro-telemetry to be taking so long or am I doing something wrong?
I have also tested with different presets like "gpx" but I don't see any performance gain.

Here is the relevant part of the code:

const videoFileObj = addedVideosFileObjArray[i].file
const cancellationToken = {cancelled: false}
const progress = percent => console.log(`gpmfExtract: ${percent}% processed`)
gpmfExtract(videoFileObj, {browserMode: true, useWorker: false, progress, cancellationToken}).then(res => {
    if (!res) return   // Cancelled
    const progress = percent => console.log(`goproTelemetry: ${percent}% processed`)
    goproTelemetry(res, {preset: "geojson", stream : "GPS", progress}).then(telemetry => {
        const telemetryJson = JSON.stringify(telemetry)
        console.log(telemetryJson)
    })
})

Here is an extract of the browser's console log:

21:13:15.053 MediaAddModal.svelte:61 gpmfExtract: 1% processed
.....
21:13:17.650 MediaAddModal.svelte:61 gpmfExtract: 100% processed
21:13:17.662 MediaAddModal.svelte:68 goproTelemetry: 0.01% processed
21:14:41.446 MediaAddModal.svelte:68 goproTelemetry: 0.2% processed
21:14:48.379 MediaAddModal.svelte:68 goproTelemetry: 0.4% processed
21:14:48.390 MediaAddModal.svelte:68 goproTelemetry: 0.6% processed
21:14:48.395 MediaAddModal.svelte:68 goproTelemetry: 0.9% processed
21:14:48.412 MediaAddModal.svelte:71 {"type":"Feature","geometry":{"type":"LineString","coordinates":
[[14.5326717,50.0264218,285.335],[14.5326715,50.0264203,285.23],... (etc... truncated)
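
For reference, the stage durations can be read straight off the timestamps in the console extract above. The following is a minimal sketch that does the subtraction; the helper names `toMs` and `elapsedSeconds` are made up for this illustration. For this particular extract it gives roughly 2.6 s for gpmfExtract and roughly 90.8 s for goproTelemetry, most of it (about 83.8 s) before the second progress callback fires. The extract may come from a different run than the one behind the 30-second figure quoted above.

```javascript
// Convert an "HH:MM:SS.mmm" console timestamp to milliseconds since midnight.
function toMs(stamp) {
  const [h, m, s] = stamp.split(':');
  return (Number(h) * 3600 + Number(m) * 60) * 1000 + Math.round(Number(s) * 1000);
}

// Elapsed seconds between two timestamps taken on the same day.
function elapsedSeconds(start, end) {
  return (toMs(end) - toMs(start)) / 1000;
}

// Timestamps copied from the console extract above.
const extractSeconds = elapsedSeconds('21:13:15.053', '21:13:17.650');
const telemetrySeconds = elapsedSeconds('21:13:17.662', '21:14:48.412');
const firstStepSeconds = elapsedSeconds('21:13:17.662', '21:14:41.446');

console.log({ extractSeconds, telemetrySeconds, firstStepSeconds });
```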
@Akxe
Contributor

Akxe commented Apr 24, 2023

Try reading the LRV files instead; they have the same data but a smaller video resolution (basically a thumbnail). It should load much faster...

@forna
Author

forna commented Apr 25, 2023

> Try reading the LRV files instead; they have the same data but a smaller video resolution (basically a thumbnail). It should load much faster...

Thanks. Unfortunately that is not a viable option

@JuanIrache
Owner

Hi. I don't use the library in a browser, so I don't know if that's slow or not. We can leave this open in case someone else has insights. I wouldn't expect the LRV files to make a difference, as the data stream is the same size, so that should not affect gopro-telemetry, just gpmf-extract.

@forna
Author

forna commented Apr 26, 2023

Ok, thanks. Apart from this minor issue the library works well, great job!

@meakbiyik

Same issue here. I have MP4 files that are around 4GB in size, and gpmfExtract works quite fast, but goproTelemetry takes ages and eats unholy amounts of memory, so much so that I cannot run it on my local device. I did telemetry extraction myself with a Python implementation and that was basically instant, so I am convinced that there must be some bug/misallocation here - though not sure where.

@Akxe
Contributor

Akxe commented Jun 12, 2023

Streaming might be the solution for the memory footprint. It would also give the user feedback about progress... As for how long it takes, who knows why Python is so fast. The original C code is only about 20% faster, so it would not be realistic to expect a big gain from trying to fix this...

@meakbiyik

I cannot really follow the code; there are so many nested loops and components that, without proper profiling, I doubt we can say anything about it. I don't think I will have time for it, but if I do, I will let you know. Overall I highly suspect that this is a bug, since it simply cannot take so long to parse a couple of megabytes of data that does not require backtracking. @Akxe are you sure that the original C code is only 20% faster? Are there any benchmarks in the repo?

@Akxe
Contributor

Akxe commented Jun 14, 2023

@meakbiyik I am not; a friend of mine tested my implementation of GoPro telemetry against the original C code, and he said that the telemetry itself is not that much slower...

But given that you are testing a 4 GB file, the extraction might be a lot slower...

Also, for me, gpmf-extract takes about 70% of the time, while the telemetry takes only about 30%...

@forna
Author

forna commented Jun 14, 2023

I think there is a misunderstanding about the file size.
The video file can be 2 or 4 GB, yes, but the process that reads it is gpmfExtract, which is blazing fast: it takes a few seconds.
gpmfExtract generates a file that is a few MB in size; in my case it is 2.68 MB.
This 2-3 MB file is the one that is processed by gopro-telemetry, and that is the culprit for this issue.
Therefore we are testing gopro-telemetry performance with a 2-3 MB file.

@Akxe
Contributor

Akxe commented Jun 14, 2023

I don't think so. It just might be dependent on the file... For me, the extraction is the long process, not the telemetry processing...

@JuanIrache
Owner

Hi @forna . Are you able to share the raw data file for analysis?

@forna
Author

forna commented Jun 14, 2023

Here is the file, you have to extract it: gpmfExtract.zip
Thanks

@JuanIrache
Owner

Thanks for sharing that. Extracting your data to geojson is taking about 1 second on my end, on a very modest mac. Are you able to share a minimal repo to reproduce your exact conditions?

@forna
Author

forna commented Jun 15, 2023

OK, sure. Do you have a favorite bundler? I am using Svelte, which is integrated with Rollup, so I was thinking of using that.

@JuanIrache
Owner

Anything that we can test by just running npm install and npm run start

@forna
Author

forna commented Jun 18, 2023

I can reproduce the issue with the goproTelemetry chained to the gpmfExtract (so the whole end-to-end process starting with the video file). This is how I do it in my application.
But I wanted to provide the repo with the goproTelemetry only so that the processing starts with the gpmfExtract file.

How have you loaded the binary gpmfExtract file I provided above into goproTelemetry?
I am struggling to convert it into a valid goproTelemetry input.

@JuanIrache
Owner

I run this node script:

const goproTelemetry = require(`gopro-telemetry`);
const fs = require('fs');

const rawData = fs.readFileSync('gpmfExtract');

goproTelemetry(
  { rawData },
  { preset: 'geojson', stream: 'GPS', progress: console.log },
  telemetry => {
    fs.writeFileSync('output_path.json', JSON.stringify(telemetry));
    console.log('Telemetry saved as JSON');
  }
);
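
To compare Node and browser runs like for like, one option is to wrap each stage in a small timer. The following is only a sketch: `timeStage` and `fakeParse` are hypothetical names, and a stand-in promise replaces the real library call so the snippet runs without any dependencies. It relies on the global `performance.now()`, which is available in browsers and in recent Node versions.

```javascript
// Run an async stage and report how long it took, in milliseconds.
// `label` and `stage` are illustrative names, not part of either library.
async function timeStage(label, stage) {
  const start = performance.now();
  const result = await stage();
  const ms = performance.now() - start;
  console.log(`${label}: ${ms.toFixed(1)} ms`);
  return { result, ms };
}

// Stand-in for a real parsing call so the sketch runs on its own.
// With the script above, the stage could instead be something like
// () => goproTelemetry({ rawData }, { preset: 'geojson', stream: 'GPS' }).
const fakeParse = () =>
  new Promise(resolve => setTimeout(() => resolve({ type: 'Feature' }), 20));

const timing = timeStage('fake parse', fakeParse);
```

Running the same wrapper around the same input in Node and in the browser would give two numbers that can be compared directly, rather than relying on console timestamps.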

@forna
Author

forna commented Jun 18, 2023

Somehow I am not able to load into goproTelemetry the gpmfextract binary file I have provided.
So I have prepared the code with both the gpmf-extract and the gopro-telemetry (that's how my application works anyway).
Also, for troubleshooting purposes I don't think it would be useful if I provided the built SPA since it would have all libs bundled, minified with tree-shaking.
I suggest using Svelte with SvelteKit, so that if you modify anything in the libs the project will be rebuilt automatically.
Here are the steps.

Assuming your projects folder is /projects and this new project's name is performance-test:

In /projects run the following command from the terminal (Svelte scaffolding):
npm create svelte@latest performance-test

When prompted you can choose:

  • Which Svelte app template? Skeleton project
  • Add type checking with TypeScript? No
  • Select additional options? Press enter (nothing required)

Then run:
cd performance-test
npm install
npm install gpmf-extract gopro-telemetry

In src/routes replace the +page.svelte file content with the one I've provided below

Start the dev server:
npm run dev

The application URL will be:
http://localhost:5173/

Select any GoPro video with a size of around 1 GB and click Upload to see the goproTelemetry slowness.
The progress can be seen in the browser's console.

+page.svelte:

<script>
    import { onMount } from 'svelte'
    import { writable } from 'svelte/store'
    import gpmfExtract from 'gpmf-extract'
    import goproTelemetry from 'gopro-telemetry'

    let uploadedFile = writable(null) // Declare uploadedFile as a writable store

    onMount(() => {
        // Update the uploadedFile store with the reference to the uploadedVideo element
        uploadedFile.set(document.getElementById('uploadedFile'))
    })

    function processVideo() {
        if ($uploadedFile.files.length === 0) {
            return
        }
        const videoFile = $uploadedFile.files[0]

        // Set the extract options
        const gpmfExtractOptions = { browserMode: true, useWorker: false }
        gpmfExtractOptions.progress = (percent) => {
            console.log('gpmfExtract progress: ' + percent)
        }
        const goProTelemetryOptions = { preset: 'geojson', stream: 'GPS' }
        goProTelemetryOptions.progress = (percent) => {
            console.log('goProTelemetry progress: ' + percent)
        }
        gpmfExtract(videoFile, gpmfExtractOptions).then(gpmfExtractData => {
            console.log('gpmfExtract completed')
            goproTelemetry(gpmfExtractData, goProTelemetryOptions).then(goproTelemetryData => {
                console.log('goproTelemetry completed')
            }).catch((error) => {
                alert('An error occurred during goproTelemetry processing: ' + error.message)
                console.error(error)
            })
        }).catch((error) => {
            alert('An error occurred during gpmfExtract processing: ' + error.message)
            console.error(error)
        })
    }
</script>

<h1>Test goproTelemetry</h1>
<form id="uploadForm" enctype="multipart/form-data">
    <input type="file" id="uploadedFile" name="video" accept="video/*" required>
    <input type="submit" value="Upload" on:click={() => processVideo()} bind:this={$uploadedFile}>
</form>

@JuanIrache
Owner

Hi @forna, thank you for providing this example. On my end, it sometimes runs fast, sometimes really slow. I haven't really used the library in a browser environment (or used Svelte at all), so I don't know if this is normal. I don't see anything evidently wrong with your implementation, and the same files are parsed almost instantly in a Node environment.

@Akxe
Contributor

Akxe commented Jun 19, 2023

That makes sense... The library heavily uses Node.js APIs that need to be polyfilled for the browser... Could you try the older version? The one that fully supported browsers out of the box?

@JuanIrache
Owner

The version with full browser support is still the published one. I'm not sure we should keep it that way, but for now, only the Dev branch has been reverted (just the @gmod/binary-parser bits)

@meakbiyik

I did a little profiling to see if it helps in understanding the issue. The result of the profiling is attached here (not sure if the raw file is helpful):

Trace-20230619T184600.zip

Overall, here are some things I noted:

image

The majority of the time is spent in findLastCC calls, and in parseV calls to some extent. Not sure why findLastCC should be triggered so often.

image

findLastCC calls seem to be mostly triggered by the recursive parseKLV calls. parseV calls also seem to be exclusively recursive, though in both cases it is just a single recursion.

Overall no obvious culprits, though I notice a lot of nested for/while loops in these functions, as well as array pushes.
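
For readers unfamiliar with what a KLV parser walks over: GoPro's published GPMF description defines each record as a key-length-value triple with an 8-byte header (a 4-character FourCC key, a 1-byte type, a 1-byte sample size and a 2-byte big-endian repeat count), followed by a payload padded to a 4-byte boundary. The sketch below reads one such header. It is an independent illustration, not the library's parseKLV; `readKlvHeader` and the hand-built `sample` record are assumptions made for this example.

```javascript
// Read one GPMF KLV header from `bytes` starting at `offset`.
// Hypothetical helper for illustration; not taken from gopro-telemetry.
function readKlvHeader(bytes, offset = 0) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const fourCC = String.fromCharCode(
    bytes[offset], bytes[offset + 1], bytes[offset + 2], bytes[offset + 3]
  );
  const type = String.fromCharCode(bytes[offset + 4]); // single-character type code
  const size = view.getUint8(offset + 5);               // bytes per sample
  const repeat = view.getUint16(offset + 6);            // sample count (DataView reads big-endian by default)
  const length = size * repeat;                         // payload bytes before padding
  const paddedLength = Math.ceil(length / 4) * 4;       // payload is aligned to 4 bytes
  return { fourCC, type, size, repeat, length, next: offset + 8 + paddedLength };
}

// Hand-built record: key "SCAL", type 'l' (32-bit signed), size 4, repeat 1, value 1000.
const sample = new Uint8Array([
  0x53, 0x43, 0x41, 0x4c, 0x6c, 0x04, 0x00, 0x01,
  0x00, 0x00, 0x03, 0xe8,
]);

console.log(readKlvHeader(sample));
```

Because each header states the payload length, a parser can in principle step from record to record in a single forward pass (`next` is the offset of the following header), which is consistent with the expectation voiced earlier in the thread that the data does not require backtracking.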

@forna
Author

forna commented Jun 19, 2023

It may be something related to the Buffer lib polyfill.
Before Svelte I was trying to bundle the libs with browserify, but the Buffer lib was the only one that browserify was not able to handle, so I did not manage to provide the browserify repo.
But it works out of the box with Svelte (I am not sure how).

@Akxe
Contributor

Akxe commented Jun 19, 2023

To my knowledge, findLastCC is responsible for finding the next part of the data... The format goes like this: VVVVDVVVVDVVVVD where V is video data and D is metadata

@Akxe
Contributor

Akxe commented Jun 19, 2023

> It may be something related to the Buffer lib polyfill. Before Svelte I was trying to bundle the libs with browserify, but the Buffer lib was the only one that browserify was not able to handle, so I did not manage to provide the browserify repo. But it works out of the box with Svelte (I am not sure how).

The current version of the library does not require any Node.js polyfill. The compiler might complain that it wants them, but they are optional. Rollup will not say anything; Webpack v5 will emit a warning; Webpack v4 will bundle polyfills. I don't know about the rest.

@JuanIrache
Owner

> It may be something related to the Buffer lib polyfill. Before Svelte I was trying to bundle the libs with browserify, but the Buffer lib was the only one that browserify was not able to handle, so I did not manage to provide the browserify repo. But it works out of the box with Svelte (I am not sure how).

If you are polyfilling and suspect Buffer is problematic, try using the dev branch:

npm i juanirache/gopro-telemetry#dev

@forna
Author

forna commented Jun 20, 2023

As Akxe mentioned there is no polyfilling required.
The dev branch has libs not supported by the browser so it's not working.

@HDv2b

HDv2b commented Nov 5, 2023

I'm wondering what the latest info is on this. I'm trying to shift server work to the client to make it as offline-friendly as possible, and finding that for large files, the processing... well, let's say I've been unable to tell if it's crashed or just taking a long time.

for example:
406MB video (4:06 mins @ 1920 * 1080 * 29.97fps)
extracts: instant, (1,645,084 bytes)
process:
0.01 - 0.2 = 30s
0.2 - 0.4 = 5s
0.4 - complete = 16s
2,469 coordinates

5.67GB video (1:12:30 @ 1920 * 1080 * 29.97fps)
extracts: 23.5s, (29,029,180 Bytes)
process:
0.01 - ??? ~ 20 minutes no movement

Times are obviously only so accurate, as it's running in the browser.
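
To put the figures reported so far on one scale, the sketch below turns them into parsing throughput. All inputs are approximate numbers quoted in this thread (forna's 2.68 MB is taken as 2.68e6 bytes), and `throughputKBps` is a name made up for this illustration.

```javascript
// Parsing throughput in kilobytes (1000 bytes) per second.
function throughputKBps(bytes, seconds) {
  return bytes / 1000 / seconds;
}

// Approximate figures reported earlier in this thread.
const reports = [
  { who: 'forna, browser', bytes: 2.68e6, seconds: 30 },
  { who: 'JuanIrache, Node (same data)', bytes: 2.68e6, seconds: 1 },
  { who: 'HDv2b, browser (406 MB video)', bytes: 1645084, seconds: 30 + 5 + 16 },
];

for (const { who, bytes, seconds } of reports) {
  console.log(`${who}: ${throughputKBps(bytes, seconds).toFixed(1)} kB/s`);
}
```

On these numbers the browser runs process a few tens of kilobytes per second, while the Node run on the same data is in the megabytes-per-second range.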

my code, running in web worker in browser:

import gpmfExtract from 'gpmf-extract';
import { goProTelemetry } from 'gopro-telemetry';

type FileSendType = { type: 'file'; file: File; videoId: string };
type FileCancelType = { type: 'cancel' };

const cancellationToken = { cancelled: false };

addEventListener(
  'message',
  (event: MessageEvent<FileSendType | FileCancelType>) => {
    switch (event.data.type) {
      case 'file': {
        // browser has sent message with video file as payload, as well as a video ID so that any info back to the browser is labelled to the right video
        const videoId = event.data.videoId;
        gpmfExtract(event.data.file, {
          browserMode: true,
          progress: (extractionProgress) => {
            return postMessage({ videoId, extractionProgress });
          },
          cancellationToken,
        })
          .then((res) => {
            // nothing to process if extraction was cancelled, so bail out before using the result
            if (!res) return;
            postMessage({ videoId, extractComplete: res });

            goProTelemetry(res, {
              stream: ['GPS'],
              debug: true, // tried playing with this...
              raw: true, // this...
              tolerant: true, // this...
              preset: 'geojson', // ...and this, and didn't notice anything different
              progress: (processingProgress) =>
                // update user of progress
                postMessage({ videoId, processingProgress }),
            })
              .then((telemetry) => {
                postMessage({ videoId, processComplete: telemetry });
              })
              .catch((e) => {
                postMessage({ videoId, error: e.message });
              });
          })
          .catch((e) => {
            postMessage({ videoId, error: e.message });
          });
        break;
      }

      case 'cancel': {
        // "cancel" button clicked by user
        cancellationToken.cancelled = true;
        break;
      }
    }
  }
);

I'm also not getting any kind of errors thrown in browser console.

When I view the performance trace I see a long brown line which, zoomed in, turns out to be a whole lot of garbage collection. Not sure what to make of that:

Major GC

Sorry that's all I really have to help, let me know what else I can do
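
On the "crashed or just taking a long time" question, one possible approach is to record when the last progress message arrived and flag a stall after a chosen period of silence. The sketch below is an assumption-laden illustration, not part of either library: `createStallDetector` is a hypothetical helper, the 60-second threshold is arbitrary, and the clock is injectable so the example runs without a real worker.

```javascript
// Track the time of the last progress message and report a stall when
// nothing has arrived for more than `timeoutMs`. `now` is injectable for testing.
function createStallDetector(timeoutMs, now = () => Date.now()) {
  let lastSeen = now();
  return {
    touch() { lastSeen = now(); },                        // call on every progress message
    isStalled() { return now() - lastSeen > timeoutMs; },
    silentForMs() { return now() - lastSeen; },
  };
}

// Demonstration with a fake clock instead of a real worker.
let clock = 0;
const detector = createStallDetector(60000, () => clock);
clock = 30000;
detector.touch();                  // a progress message arrived at t = 30 s
clock = 80000;
console.log(detector.isStalled()); // false: only 50 s of silence
clock = 100000;
console.log(detector.isStalled()); // true: 70 s of silence
```

On the main thread, `touch()` could be called from the worker's message handler whenever a `processingProgress` message arrives, with `isStalled()` polled on a timer. Note that silence alone cannot distinguish a hang from a long synchronous step: the progress callbacks reported in this thread are sparse, with tens of seconds between the first two.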

@HDv2b

HDv2b commented Nov 5, 2023

The thing is when running in Node, the extraction and processing combined for the same 5.67GB video takes 30 seconds, while browser side sits idle for at least 20 mins, so something in the browser version is definitely ending prematurely or getting stuck in a loop. By comparison, the 406MB video which takes about 50s in browser is done in less than 2 seconds in Node.

Forgot to point out, this is with:

    "gopro-telemetry": "^1.2.3",
    "gpmf-extract": "^0.3.1",

I also tried with and without egm96-universal installed, but that made no difference.

I also just tried the svelte example above (I'm using react/next in my example) and still seeing same results as described by @forna , in his version as with my own.

@HDv2b

HDv2b commented Nov 5, 2023

@forna FWIW, I'm noticing a huge speed difference in your Svelte example between Firefox and Chrome/Edge, FF being way slower just to extract, let alone process the telemetry. Are you seeing this, and do you think it provides any clues?

FF @ 930 ticks per %
image

Edge @ ~30-40 ticks per %
image

@forna
Author

forna commented Nov 5, 2023

Hi, unfortunately I am not using goproTelemetry anymore since I was not able to make it work properly in the browser.
I have developed my own script but it only extracts a small subset of the available metadata.
