Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TIFF Metadata for Offsets #19

Merged
merged 16 commits into from Apr 3, 2020
Merged

Add TIFF Metadata for Offsets #19

merged 16 commits into from Apr 3, 2020

Conversation

ilan-gold
Copy link
Member

@ilan-gold ilan-gold commented Mar 26, 2020

Related to geotiffjs/geotiff.js#132 and vitessce/vitessce#489, our TIFF files with lots of image pages (i.e image pyramids, CODEX etc.), need offsets to be given as well so that there is no need to traverse the linked-list structure.

I've also added some other things relating to deployment because testing this requires a python script and not just diffing files. If there's a better way, let me know - my DevOps experience is minimal.

@ilan-gold ilan-gold requested a review from mccalluc March 26, 2020 19:10
Copy link
Contributor

@mccalluc mccalluc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to go this way, exactly. More thoughts below... but this is on the right track. Thanks!

Dockerfile Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
'--actual_dir', required=True,
help='directory containing actual ome-tiff outputs')
args = parser.parse_args()
main(args.expected_dir, args.actual_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do appreciate the impulse, I just worry about adding to the burden going forward... now instead of just letting docker do its thing, I need to get things running locally. (And my experience is that these kinds of libraries tend to be finicky.)

Another possibility besides the sed/diff, would be to keep the python tests, but move it to its own repo, and do the testing there: Here, it could just checkout that repo and run with it, no need to publish to pypi or anything like that.

@mccalluc
Copy link
Contributor

Hi, thinking a bit more about this: The first container isn't really that different: For the main output, the arrow file, a diff is really not going to be useful. (It might not even have been stable between platforms, I can't remember.) But to get something that is useful to diff, I think I took the arrow and generated the json from that.

So if that same idea is applied here: Diffing a tiff is not going to give you any useful information if there is an error.... but extracting the xml and saving it alongside the tiff would give us something we could check. Does that sound workable?

@ilan-gold
Copy link
Member Author

I think that could work. In any case, the sed + diff option is cleaner. Hopefully the regex isn't too nasty :)

@ilan-gold
Copy link
Member Author

Is there a better way to do sed on both mac and linux?

Co-Authored-By: Chuck McCallum <mccalluc@users.noreply.github.com>
@ilan-gold ilan-gold requested a review from mccalluc March 31, 2020 15:42
Copy link
Contributor

@mccalluc mccalluc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool: Thanks for revising this! I think the python installs you're doing on Travis are now unnecessary? Apart from that it's all little style things you can take, or leave if you disagree, and merge when you think it's ready.

(Most trivial of trivial style points: I have a preference for capitalizing the start of most comments, and if it's a sentence, having periods: Improves readability a bit)

.gitignore Outdated
@@ -1,5 +1,9 @@
**/test-output-actual

.DS_STORE
dask-worker-space
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like not to be running python outside the container.

.travis.yml Outdated
before-install:
- sudo apt-get install -y gcc python3-dev
install:
- pip install -r ./containers/ome-tiff-offsets/context/requirements.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto: I would really like everything to happen inside the container... I don't want to depend on python outside.


# For tiff packages
RUN apt-get update &&\
apt-get install -y gcc python3-dev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I was thinking we could use the same base image for everything... but if we end up with a lot of installs for just one pipeline or another, then might not be the right right.

No action for now, but something to keep in mind.

Comment on lines 19 to 21
mkdir(output_dir)
except FileExistsError:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mkdir(output_dir)
except FileExistsError:
pass
mkdirs(output_dir, exist_ok=True)

and change the import

containers/ome-tiff-offsets/context/main.py Outdated Show resolved Hide resolved
containers/ome-tiff-offsets/context/main.py Outdated Show resolved Hide resolved
containers/ome-tiff-offsets/context/main.py Outdated Show resolved Hide resolved
containers/ome-tiff-offsets/context/main.py Outdated Show resolved Hide resolved
new_ome_tiff_path = Path(output_dir) / input_path.name
# write the file out
with open(Path(output_dir) / 'ome.xml', 'w') as xml_write:
xml_write.write(str(omexml))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what kind of object omexml is, but is there way to pretty-print it, so a line oriented diff would be useful, if that's ever necessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or if the object itself doesn't offer this, wrap it in an xml object that does.

test.sh Outdated Show resolved Hide resolved
@mccalluc
Copy link
Contributor

mccalluc commented Apr 1, 2020

Is there a better way to do sed on both mac and linux?

I don't know the precise thing you're running into, but I'm a bit more familiar with perl syntax if I'm trying to do something weird... that might not be helpful for you.

ilan-gold and others added 6 commits April 2, 2020 17:26
Co-Authored-By: Chuck McCallum <mccalluc@users.noreply.github.com>
Co-Authored-By: Chuck McCallum <mccalluc@users.noreply.github.com>
@ilan-gold ilan-gold merged commit e7baa2a into master Apr 3, 2020
@ilan-gold ilan-gold deleted the ilan-gold/tiff_metadata branch April 3, 2020 14:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants