Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image processing updates #6

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

Conversation

padma18-vb
Copy link
Contributor

Fully functional ProcessImage.py creates images and hdf5 files for one site.

@ljlamarche ljlamarche self-requested a review July 26, 2021 19:25
Copy link
Contributor

@ljlamarche ljlamarche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this looks pretty good. There are a few places that don't use context management to open files, but those might be outdated (in scripts we don't need anymore). Calibration things we're also still figuring out. My last comment is regarding create_hdf5. Does it make sense for it to open and read in all the PNG files as a completely independent module, or would it be better to have MangoImage.py return arrays in addition to saving them to the png files, then collect them and save that combined array directly?

Comment on lines 30 to 35
f = open(siteInfofname,'r')
siteText = f.read()
s = StringIO(siteText)
siteData = numpy.genfromtxt(s, dtype="|S", delimiter = ',', autostrip=True)

siteData = numpy.genfromtxt(s, dtype="|S", delimiter = ',', autostrip=True)

rows = siteData[:,0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done with context managers (i.e., the 'with' statement)? Also, can this be simplified with csv?

Copy link
Contributor Author

@padma18-vb padma18-vb Jul 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is changed in ProcessImage.py, I didn't edit anything in this file. Sorry, I think I should've been more careful with deleting unopened files!

Comment on lines 62 to 74
# get Site name based on folder being processed
if rawFolder[0][0] == 'H':
siteName = 'Hat Creek Observatory'
else:
if siteDir[0] == 'C':
if siteDir[0] == 'C':
siteName = 'Capitol Reef Field Station'
else:
else:
if siteDir[0] == 'I':
siteName = 'Eastern Iowa Observatory'
else:
if siteDir[0] == 'B':
siteName = 'Bridger'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be replaced with a dictionary indexed by unique site ID? I believe we're transitioning to three letter codes that should be easier to handle.

Copy link
Contributor Author

@padma18-vb padma18-vb Jul 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't make any modifications to this file. I think sites are processed differently in ProcessImage.py?

MANGOimage.py Outdated

self.loadFITS()
except ValueError:
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this do anything or print an error statement, or is it expected to fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should print an error statement - currently it just exits creating the object. I'll add that to it!

MANGOimage.py Outdated
self.newIMatrix = genfromtxt(newIFilename, delimiter=',')
self.newJMatrix = genfromtxt(newJFilename, delimiter=',')
# print self.newIMatrix.shape, self.newJMatrix.shape
f = open(self.rawImageAddress, "rb")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can file be opened/closed with context managers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I can modify it to use context managers, but isn't the functionality the same?

MANGOimage.py Outdated
Comment on lines 115 to 116
maxIndex = np.argmin(abs(cdf - 0.99 * max_cdf))
minIndex = np.argmin(abs(cdf - 0.01 * max_cdf))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exact cutoff values should ideally come from a config file and be passed to the function, rather than hardcoded here.

MANGOimage.py Outdated

# Equalization
img_eq = exposure.equalize_hist(self.imageData, nbins=10000)
p1, p99 = np.percentile(self.imageData, (1, 99))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, avoid hardcoding cutoff values.

ProcessImage.py Outdated


def date_formatter(originalName):
dates = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, we no longer have use for this function, I changed it so that we use strptime, I can delete this function.

@@ -0,0 +1,10 @@
Site Name,Site Abbreviation,Center Longitude,Center Latitude
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update all of these to use the 3-letter site ID codes, then use those codes to identify sites throughout the code. Full names should be reserved for printing titles on plots and similar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we only use the 3-letter ID codes in the code, and I'll upload the edited version of SiteInformation.csv that contains the 3-letter codes.

images = np.array(images)

# save hdf5 file
f = h5py.File(filename, 'w')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be possible to compact this buy saving a lot of the metadata in a dictionary first.

@padma18-vb
Copy link
Contributor Author

In general, this looks pretty good. There are a few places that don't use context management to open files, but those might be outdated (in scripts we don't need anymore). Calibration things we're also still figuring out. My last comment is regarding create_hdf5. Does it make sense for it to open and read in all the PNG files as a completely independent module, or would it be better to have MangoImage.py return arrays in addition to saving them to the png files, then collect them and save that combined array directly?

Currently, ProcessImage.py creates a hdf5 file for a day once all the images in that day have been processed - I can instantiate an empty dictionary in ProcessImage.py and populate it with the image name and image array (returned from MANGOImage.py) and pass that into create_hd5f_files.py so that its just saved to the file, rather than opening the image + reading in the array in create_hdf5_files.py again. Does that sound okay?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants