modifying loadimages #535

Closed
kuandannyan opened this issue Mar 27, 2013 · 14 comments

Comments

@kuandannyan

We are working on an extension of the existing loadimages.py that would let it communicate with an image management database similar to the Open Microscopy Environment (OME). The extended module should also be able to retrieve and group images based on the experiment metadata and image locations stored in the database.

However, the major problem is that loadimages.py has a large number of dependencies. Could you give us some advice on how to start?

@LeeKamentsky

I think that LoadImages is somewhat overburdened at this point with logic for many cases. If I were connecting to an image management database, I would create a separate module. An image loading module has two parts: in prepare_run(), it creates image sets by writing image measurements; in run(), it reads these image measurements and loads the actual images. I have an example image loading module in a tutorial I'm putting together. It creates image sets for images timestamped with the current day:
https://github.com/CellProfiler/CellProfiler/blob/multiprocessing/tutorial/example6b.py
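
The two-phase split described above can be illustrated without any CellProfiler imports. This is only a sketch: `ToyImageLoader`, the plain `measurements` dict, the `read_image` callback and the `FileName_`/`PathName_` key prefixes are stand-ins invented for illustration, not the real CellProfiler module API.

```python
import os


class ToyImageLoader(object):
    """Plain-Python stand-in for an image loading module (hypothetical API)."""

    def __init__(self, image_name, paths):
        self.image_name = image_name
        self.paths = list(paths)

    def prepare_run(self, measurements):
        # Phase 1: record one image set per file, without reading pixel data.
        for image_number, path in enumerate(self.paths, start=1):
            directory, filename = os.path.split(path)
            measurements[image_number] = {
                "PathName_" + self.image_name: directory,
                "FileName_" + self.image_name: filename,
            }
        return True

    def run(self, measurements, image_number, read_image):
        # Phase 2: look up the recorded location and load the actual image.
        record = measurements[image_number]
        path = os.path.join(
            record["PathName_" + self.image_name],
            record["FileName_" + self.image_name],
        )
        return read_image(path)
```

Because phase 1 only writes locations, the list of image sets can come from any source (a directory listing, a database query) while phase 2 stays unchanged.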

The second approach is not to write any module at all. LoadData takes a .csv where each row is an image set and the columns contain either file and path names or metadata values. I think LoadData is well suited to a LIMS environment and an image management database. It's likely that you can craft an SQL query whose output can be saved to a file and used directly with LoadData.
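
As a rough sketch of the query-to-CSV idea, the snippet below exports rows from a database into a file with one image set per row. The `images` table, its columns, and the "DNA" channel name are hypothetical; the `Image_FileName_`, `Image_PathName_` and `Metadata_` header prefixes follow the LoadData column naming convention as commonly documented, so check them against your CellProfiler version. It uses an in-memory SQLite database in place of a real LIMS and is written for a current Python 3 interpreter.

```python
import csv
import io
import sqlite3


def export_loaddata_csv(connection, out_file, channel="DNA"):
    """Write one CSV row per image set from a hypothetical `images` table."""
    query = (
        "SELECT file_name, path_name, plate, well, site "
        "FROM images ORDER BY plate, well, site"
    )
    writer = csv.writer(out_file, lineterminator="\n")
    # Header row: file and path columns for one channel, then metadata tags.
    writer.writerow([
        "Image_FileName_" + channel,
        "Image_PathName_" + channel,
        "Metadata_Plate",
        "Metadata_Well",
        "Metadata_Site",
    ])
    row_count = 0
    for row in connection.execute(query):
        writer.writerow(row)
        row_count += 1
    return row_count


# Demo with an in-memory database standing in for the LIMS.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE images "
    "(file_name TEXT, path_name TEXT, plate TEXT, well TEXT, site INTEGER)"
)
connection.executemany(
    "INSERT INTO images VALUES (?, ?, ?, ?, ?)",
    [
        ("b02_s1.tif", "/data/plate1", "P1", "B02", 1),
        ("a01_s1.tif", "/data/plate1", "P1", "A01", 1),
    ],
)
buffer = io.StringIO()
rows_written = export_loaddata_csv(connection, buffer)
csv_text = buffer.getvalue()
```

Writing to a file-like object keeps the export testable; in practice `out_file` would be an open file in a location that every CellProfiler client can read.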

@kuandannyan
Author

@LeeKamentsky thank you for the prompt reply. We have considered and tested both solutions. Here are the persistent problems.

  1. We prefer to make another module and deploy it as a CellProfiler plugin. However, the new module does not seem to be able to access measurements.py and other dependent modules. We are using a trunk build of CellProfiler installed on Windows 7. The new image loading module is placed in the CellProfiler plugin folder set in the preferences. We did no additional configuration beyond that.
  2. The second approach works properly, but only if the NAS is mapped as a drive on each individual client PC. However, the NAS only offers SFTP for data access.

@LeeKamentsky

I'm not sure why you can't import the measurements module. Something like

import cellprofiler.measurements

should work. Can you print out your python path at the start of your plugin?

import sys
print sys.path

--Lee
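
A slightly more informative diagnostic than printing the path is to attempt the import and report where the module was found or why it failed. This is a minimal sketch; `describe_import` is a hypothetical helper name, and it relies only on the standard library's `importlib.import_module`.

```python
import importlib


def describe_import(module_name):
    """Return (importable, location_or_error) for a module name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as error:
        # The error text usually names the missing module or package.
        return False, str(error)
    # Built-in modules have no __file__, so fall back to a marker string.
    return True, getattr(module, "__file__", "<built-in>")
```

Calling `describe_import("cellprofiler.measurements")` at the top of a plugin would show either the file that was loaded or the import error message.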

@kuandannyan
Author

@LeeKamentsky
Here is the printout of sys.path:
...\cp_plugins
e:\program files\cellprofilertrunk\library.zip
e:\program files\cellprofilertrunk
e:\program files\cellprofilertrunk\site-packages

The reason I didn't remove the import of measurements.py is that it is in the sample code you sent me. I thought it had something to do with recording the image index.

Could you make a very simple module that loads an image from a given file path and passes the Image to the rest of the pipeline?

@kuandannyan
Author

@LeeKamentsky
measurements.py is working now.

Still, could you make a very simple module that loads an image from a given file path and passes the Image to the rest of the pipeline?

@LeeKamentsky

Here's a module that lets you specify the images to load by name and loads
them. I installed the latest trunk build and tried it out and it works.

Here's the link: (http://www.broadinstitute.org/~leek/loadsomefiles.py)

@kuandannyan
Author

@LeeKamentsky

The code sample works perfectly. The problem with the earlier sample code was that the definitions of C_FILE_NAME and C_FILE_PATH are missing from measurements.py. Judging by the later sample code, these two constants seem to have been moved to loadimages.py.

Since the basic sample is working, could you show me how to configure CellProfiler to include additional Python packages, such as pysftp for SFTP or suds for web services?

@LeeKamentsky

We don't have a lot of support for adding packages. If you look in "c:\Program Files\CellProfiler", you'll see a directory named "site-packages". I think the following technique will work, but since pysftp is not pure Python, you may run into problems building pysftp and its dependencies:

Create a temporary directory (e.g. c:\Temp) on a machine that has a 64-bit version of Python 2.7 installed, and create a subdirectory named "site-packages" inside it.

Install setuptools (https://pypi.python.org/pypi/setuptools) if it is not already installed.

set PYTHONPATH=c:\Temp\site-packages

python -m easy_install -d c:\Temp\site-packages --always-unzip --always-copy pysftp

Copy the contents of the site-packages directory on the development machine to the site-packages on the user machine that has CellProfiler installed.

I'm wishing you luck here, but this is definitely not guaranteed to work.
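
Before copying a site-packages directory to another machine, it can help to check whether anything in it contains compiled code, since those files are the likely source of loading problems. The following is a sketch only: `find_binary_files` and the suffix list are assumptions for illustration, and a clean result does not prove that a package will load inside CellProfiler.

```python
import os

# File suffixes that usually indicate compiled extension modules or libraries.
BINARY_SUFFIXES = (".pyd", ".dll", ".so")


def find_binary_files(package_dir):
    """List files under package_dir whose suffix suggests compiled code."""
    found = []
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            if name.lower().endswith(BINARY_SUFFIXES):
                found.append(os.path.join(root, name))
    return sorted(found)
```

For example, `find_binary_files(r"c:\Temp\site-packages")` returning an empty list suggests that the installed packages are pure Python.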

@kuandannyan
Author

@LeeKamentsky Thank you for the warning. In fact, I have set up a test environment in Eclipse, and it works if pysftp is installed in the site-packages of Python 2.7, together with several dependencies. I will test it with CellProfiler tomorrow and report the result back to you.

@kuandannyan
Author

@LeeKamentsky Including pysftp doesn't work: the DLL is not loaded correctly. We will now go for option 2, using the LoadData module and a CSV file. Thanks for the help. I will close this issue.

@kuandannyan
Author

@LeeKamentsky Thanks for the help with the Java-Python bridging. The SFTP fetch problem is solved by handing the SFTP connection over to Java. The web service library suds does work with CellProfiler. You are correct that CellProfiler can only include pure Python libraries. We are now at phase 2, attempting to make a new module for our LIMS.

Basically, phase 2 is to group images based on metadata including plate_index, row_index, col_index, site_index and time_point. This information is retrieved from a database, where these fields together form the composite primary key of each image. Images are grouped by plate_index, row_index, col_index and site_index, so that each group forms a time-lapse image sequence. I understand there is a loaddata.py that loads a CSV as metadata. In our case, the information looks exactly like a CSV file for loaddata.py, but we really want to make a seamless module without using CSV.

So how can we duplicate the "group images by metadata" mechanism of loadimages.py in our module based on this information? Could you provide us with some sample code?
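
The grouping described above can be sketched in plain Python, independent of CellProfiler. The field names come from the description; `group_time_lapse` and the dict-per-record layout are assumptions for illustration, not how loadimages.py implements grouping.

```python
from collections import defaultdict

# Fields that identify one time-lapse sequence (one site in one well).
GROUP_KEYS = ("plate_index", "row_index", "col_index", "site_index")


def group_time_lapse(records):
    """Group image records by plate/row/col/site, ordered by time_point."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[name] for name in GROUP_KEYS)
        groups[key].append(record)
    # Within each group, order the images chronologically.
    for members in groups.values():
        members.sort(key=lambda record: record["time_point"])
    return dict(groups)
```

Each key of the returned dict corresponds to one group, and the position of a record in its list corresponds to its index within that group.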

@LeeKamentsky

I've got a version here that shows how to group by plate, well and site metadata: http://www.broadinstitute.org/~leek/loadsomefiles_groups.py

Hopefully you can easily adapt it.

@kuandannyan
Author

@LeeKamentsky It fits our problem exactly. You have saved the day. This problem has bothered me for over two weeks.

A minor question: if metadata is defined in loadimages, the metadata tags can be used later as variables for the output directory or file naming via "insert tag". How can I define these kinds of tags?

@LeeKamentsky

Oops - I forgot to add that to the module. You need to add the method
get_measurement_columns, and it will happen automatically. I've updated the
sample at the link above. If you add an image measurement that starts with
"Metadata_", you'll be able to pick it as a tag.

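As a sketch of what such a column list might look like, the helper below builds (object name, feature name, column type) tuples for a set of metadata tags. The constants are local stand-ins that are assumed to mirror `cellprofiler.measurements.IMAGE` and its varchar column type format; `metadata_measurement_columns` is a hypothetical helper, and in a real module the list would be returned from the module's get_measurement_columns method.

```python
# Local stand-ins, assumed to mirror constants in cellprofiler.measurements.
IMAGE = "Image"
COLTYPE_VARCHAR_FORMAT = "varchar(%d)"


def metadata_measurement_columns(tags, max_length=255):
    """Build (object name, feature name, column type) tuples for metadata tags."""
    return [
        (IMAGE, "Metadata_" + tag, COLTYPE_VARCHAR_FORMAT % max_length)
        for tag in tags
    ]
```

With tags such as "Plate", "Well" and "Site", the resulting feature names all start with "Metadata_", which is the prefix that makes them selectable as tags.
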