Video Demos:

Not everyone can, or wants to, get an API key for the latest and best LLM models, so here are videos showcasing what's possible. You will notice that Omega sometimes fails on its first attempt, typically because of mistaken function parameters or other syntax errors. But it often recovers by reading the error message and reasoning its way to the right piece of code. The videos below were mostly made with GPT-4, and some of the older ones with GPT-3.5.

The latest videos can be found here.

In this first video, I ask Omega to make a napari widget to convert images from RGB to grayscale:

1.2_MozartConvertToGrayscale_good_evenfaster.mp4
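
Omega's exact code is not shown here, but a minimal sketch of this kind of widget, assuming napari, magicgui and scikit-image are installed (names are illustrative), could look like this:

```python
import napari
from magicgui import magicgui
from napari.types import ImageData
from skimage.color import rgb2gray

@magicgui(call_button="Convert to grayscale")
def rgb_to_grayscale(image: ImageData) -> ImageData:
    # rgb2gray expects an (..., 3) RGB array and returns floats in [0, 1]
    return rgb2gray(image)

viewer = napari.Viewer()
viewer.window.add_dock_widget(rgb_to_grayscale)  # returned images become new layers
napari.run()
```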

Of course, Omega is capable of holding a conversation; it sort of knows 'who it is', and it can search the web and Wikipedia. Eventually, I imagine it could leverage the ability to search to improve its responses, and I have seen it do so a few times:

1.7_SayHelloTellMe_okish_evenfaster.mp4

Following up on the previous video, I ask Omega to create a new labels layer containing just the largest segment. The script that Omega writes has another rookie mistake: it confuses layers and images. The error message then confuses Omega into thinking that it got the name of the layer wrong, setting it off on a quest to find the name of the labels layer. It succeeds at writing code that searches for the labels layer, and uses that name to write a script that then extracts the largest segment into its own layer. Not bad:

3_Cells3DOtsuThenLabelsSelectLargestSegment_evenfaster.mp4
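
The final working code is not shown verbatim in the video, but a sketch of the approach Omega converges on, with synthetic blobs standing in for the cell image, might look like this:

```python
import napari
import numpy as np
from skimage.data import binary_blobs
from skimage.measure import label

viewer = napari.Viewer()
viewer.add_labels(label(binary_blobs(length=256, volume_fraction=0.3)), name="labels")

# Search the viewer for the labels layer by type, much like Omega's search loop:
labels_layer = next(l for l in viewer.layers if isinstance(l, napari.layers.Labels))
labels = labels_layer.data

# Pixel count per label; label 0 is background, so exclude it:
counts = np.bincount(labels.ravel())
counts[0] = 0
largest = int(np.argmax(counts))

viewer.add_labels(np.where(labels == largest, largest, 0), name="largest segment")
napari.run()
```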

In this video, I ask Omega to write a 'segmentation widget'. Pretty unspecific. The answer is a vanilla yet effective widget that uses the Otsu approach to threshold the image and then finds the connected components. Note that when you ask Omega to make a widget, it won't know of any runtime issues with the code because it is not running the code itself, yet. It can tell if there is a syntax problem though... Nevertheless, the widget ends up working just fine:

4_CellsSegment_evenfaster.mp4
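
Again hedging on the exact code, a widget of this shape, an Otsu threshold followed by connected components, is straightforward to write with magicgui and scikit-image:

```python
import napari
from magicgui import magicgui
from napari.types import ImageData, LabelsData
from skimage.filters import threshold_otsu
from skimage.measure import label

@magicgui(call_button="Segment")
def segmentation_widget(image: ImageData) -> LabelsData:
    # Otsu picks a global threshold; label() then finds connected components
    binary = image > threshold_otsu(image)
    return label(binary)

viewer = napari.Viewer()
viewer.window.add_dock_widget(segmentation_widget)
napari.run()
```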

Now it gets more interesting. Following up on the previous video, can we ask Omega to do some follow-up analysis on the segments themselves? I ask Omega to list the 10 largest segments and compute their areas and centroids. No problem:

5_CellsListSegmentsByAreaTrimmed_evenfaster.mp4
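
The measurement itself is essentially a few lines with scikit-image's regionprops; a self-contained sketch with synthetic labels:

```python
import numpy as np
from skimage.data import binary_blobs
from skimage.measure import label, regionprops

labels = label(binary_blobs(length=256, volume_fraction=0.3))  # stand-in segmentation

# Sort the segments by area, largest first, and report the top 10:
regions = sorted(regionprops(labels), key=lambda r: r.area, reverse=True)
for r in regions[:10]:
    print(f"label {r.label}: area = {r.area} px, centroid = {np.round(r.centroid, 1)}")
```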

Note: You could even ask for it in markdown format, which would look better (not shown here).

Next I ask Omega to make a widget that lets me filter segments by area. And it works beautifully. Arguably this is not rocket science, but the thought-to-widget speed-up must be a factor in the hundreds when comparing Omega to an average user trying to write their own widget:

6_CellsSegmentsFilterByArea_evenfaster.mp4
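
A minimal sketch of such an area filter, assuming magicgui; the parameter bounds are illustrative:

```python
import numpy as np
from magicgui import magicgui
from napari.types import LabelsData

@magicgui(auto_call=True, min_area={"max": 100_000})
def filter_by_area(labels: LabelsData, min_area: int = 100) -> LabelsData:
    # Zero out every segment whose pixel count falls below min_area
    counts = np.bincount(labels.ravel())
    keep = counts >= min_area
    keep[0] = False  # background stays background
    return np.where(keep[labels], labels, 0)
```

Dock it with `viewer.window.add_dock_widget(filter_by_area)` as in the earlier sketches.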

This is an example of a failed widget. I ask for a widget that can do dilations and erosions. The widget is created but is 'broken' because Omega made the mistake of using floats for the number of dilations and erosions (in the next video I tell Omega to fix it):

7_CellsErosionDilationFirstPart_evenfaster.mp4

Following up on the previous video, I explain that I want the two parameters (number of erosions and dilations) to be integers. Notice that I exploit the conversational nature of the agent by assuming that it remembers what the widget is about:

8_CellsErosionDilationSecondPartFixed_evenfaster.mp4
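
The gist of the fix is in the type annotations: magicgui builds an integer spinbox for `int` parameters, and iteration counts must be integers. A sketch of the corrected widget, assuming scipy:

```python
from magicgui import magicgui
from napari.types import ImageData
from scipy import ndimage

@magicgui(call_button="Apply")
def morphology_widget(image: ImageData, erosions: int = 1, dilations: int = 1) -> ImageData:
    # `int` (not `float`) annotations are the fix: the counts drive range()
    result = image
    for _ in range(erosions):
        result = ndimage.grey_erosion(result, size=3)
    for _ in range(dilations):
        result = ndimage.grey_dilation(result, size=3)
    return result
```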

This video demos a specialised 'cell and nuclei segmentation tool' which leverages cellpose 2.0 to segment cell cytoplasms or nuclei. In general, we can't assume that LLMs know about every single image processing library, especially for specific domains, so it can be a good strategy to provide such specialised tools. After Omega successfully segments the nuclei, I ask it to count them. Answer: 340. Notice that the generated code 'searches' for the layer named 'segmented' with a loop. Cute:

9_CellsSegmentCellPoseCount_evenfaster.mov
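
The counting step is simple once the right layer is found; a sketch of the loop-based search plus the count, with synthetic labels standing in for the cellpose output:

```python
import napari
import numpy as np
from skimage.data import binary_blobs
from skimage.measure import label

viewer = napari.Viewer()
viewer.add_labels(label(binary_blobs(length=256)), name="segmented")  # stand-in for cellpose output

# The loop-based search Omega generated, roughly:
segmented = None
for layer in viewer.layers:
    if layer.name == "segmented":
        segmented = layer.data
        break

# Each nucleus carries a unique positive label; subtract 1 for background (0):
count = len(np.unique(segmented)) - 1
print(f"Number of nuclei: {count}")
```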

Enough with cells. Apparently, the 'memory' of ChatGPT is filled with unnecessary information: it knows the URL of Albert Einstein's photo on Wikipedia, and combined with the 'napari file open' tool it can therefore open that photo in napari:

10_EinsteinPhoto_success_evenfaster.mp4

You can ask for rather incongruous widgets, widgets you would probably never write because you need them just once. Here I ask for a widget that applies a rather odd non-linear transformation to each pixel. The result is predictably boring, but it works, and I don't think that the answer was 'copy-pasted' from somewhere else...

10.5_EinsteinPixelFormula_evenfaster.mp4
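
The exact formula from the video is not shown, so the one below is purely illustrative; the pattern, a widget that maps a numpy expression over every pixel, is the interesting part:

```python
import numpy as np
from magicgui import magicgui
from napari.types import ImageData

@magicgui(call_button="Apply formula")
def pixel_formula(image: ImageData) -> ImageData:
    x = image.astype(float) / image.max()  # normalise to [0, 1]
    return np.sin(np.pi * x) ** 2  # hypothetical formula, not the one in the video
```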

In this one, starting again from our beloved Albert, I ask Omega to rename that layer to 'Einstein', which looks better than just 'array'. Then I ask Omega to apply a Canny edge filter. Predictably, it uses scikit-image:

11_EinsteinPhotoCannyEdge_evenfaster.mp4
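
The core of what Omega writes is most likely a single call to scikit-image's canny, which operates on 2D grayscale images (here with a stand-in photo from skimage.data):

```python
from skimage import data
from skimage.feature import canny

image = data.camera()            # stand-in grayscale photo
edges = canny(image, sigma=2.0)  # boolean edge map; sigma smooths before detection
```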

Then I ask for a 'Canny edge detection widget'. It happily makes the widget and offers relevant parameters:

11.5_EinsteinPhotoCannyEdgeWidget_good_evenfaster.mp4

Following up on the previous video, I play with dilations on the edge image. Omega has some trouble when I ask it to 'do it again'. Fine, sometimes you have to be a bit more explicit:

12_EinsteinPhotoCannyEdgeMorphological_good_evenfaster.mp4

You can also experiment with more classic 'numpy' code by creating and manipulating arrays and visualising the output live:

12.5_3DArrayFormulaProject_evenfaster.mp4
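
A sketch of this kind of session: build a 3D array from a formula, then view the volume and a projection live in napari:

```python
import napari
import numpy as np

# Open coordinate grids broadcast against each other to fill the volume:
z, y, x = np.ogrid[0:64, 0:64, 0:64]
volume = np.sin(x / 8.0) * np.cos(y / 8.0) * np.sin(z / 8.0)

viewer = napari.Viewer()
viewer.add_image(volume, name="formula volume", colormap="magma")
viewer.add_image(volume.max(axis=0), name="max projection")
napari.run()
```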

This video demonstrates that Omega understands many aspects of the napari viewer API. It can switch viewing modes, translate layers, etc.:

13_NapariViewerControl_evenfaster.mp4
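
A few representative viewer-API calls of the kind shown in the video (the layer name is illustrative):

```python
import napari
from skimage import data

viewer = napari.Viewer()
viewer.add_image(data.camera(), name="Einstein")  # stand-in for the photo

viewer.dims.ndisplay = 3                       # switch between 2D and 3D display
viewer.layers["Einstein"].translate = (50, 0)  # shift the layer by 50 pixels
viewer.camera.zoom = 2.0                       # zoom the view
napari.run()
```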

I never thought this one would work: I ask Omega to open an mp4 video from a URL in napari and then use OpenCV to detect people. It does it. But one thing Omega does not know is that creating a layer for each frame of the video is not a practical approach. It is not clear what happened to the colors, though; probably an RGB ordering or format issue:

15_LoadMP4VideoFromURLOpenCVPeopleDetection_evenfaster.mp4
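
A sketch of the approach, assuming opencv-python; the URL is a placeholder. Note the BGR-to-RGB conversion: OpenCV decodes frames in BGR order, which is one plausible explanation for the odd colors:

```python
import cv2
import napari

cap = cv2.VideoCapture("https://example.com/video.mp4")  # placeholder URL
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

viewer = napari.Viewer()
ret, frame = cap.read()
while ret:
    boxes, _ = hog.detectMultiScale(frame)  # HOG-based people detection per frame
    for (px, py, w, h) in boxes:
        cv2.rectangle(frame, (px, py), (px + w, py + h), (0, 255, 0), 2)
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # napari expects RGB, OpenCV yields BGR
    # One layer per frame, as in the video; stacking frames into a single
    # array and calling add_image once would be the practical approach.
    viewer.add_image(rgb, name="frame")
    ret, frame = cap.read()
cap.release()
napari.run()
```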