Improve metadata #12
Conversation
It probably makes sense for me to look at the
I reviewed all files in the data_models directory. Overall it is written very well. I found a few places that may need clarification and some coding style issues. One of the main reasons why I am so obsessed with docstrings in a Python class is that in the future we can compile a README file dynamically with tools such as sphinx.
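For what that could look like, here is a minimal sketch (the class body and wording are illustrative, not the actual data_refinery_models code) of the kind of docstring that Sphinx's autodoc can later pull into generated documentation:

```python
class SurveyJob:
    """Records a single run of a surveyor against an external data source.

    Even a one-line summary like the first line above is enough for
    documentation generators: Sphinx's autodoc reads the docstring
    straight off the class object.
    """

# Docstrings are plain runtime attributes, which is what autodoc relies on:
summary = SurveyJob.__doc__.splitlines()[0]
print(summary)
```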
Something that is not directly related to this PR: Since you are expanding the django models on this project, do you have a database schema online somewhere? That will make both code review and your future life easier. As an example, here is adage web server's database schema drawn by Casey, Rene, Matt and me:
https://sketchboard.me/rzMAA9ecMFEh
(It is up to you which tool you want to use to draw it. We used sketchboard because Casey bought it.)
from data_refinery_models.models.base_models import TimeTrackedModel


class SurveyJob(TimeTrackedModel):
Can you add a docstring for this class too? (If you feel it is obvious, just one line will be good enough.) Thanks.
from data_refinery_models.models.surveys import SurveyJob


class BatchStatuses(Enum):
A docstring of class description will be nice.
PROCESSED = "PROCESSED"


class Batch(TimeTrackedModel):
A docstring of class description will be nice.
survey_job = models.ForeignKey(SurveyJob, on_delete=models.PROTECT)
source_type = models.CharField(max_length=256)
size_in_bytes = models.IntegerField()
download_url = models.CharField(max_length=2048)
Just curious: Is this download URL long enough? (I know that many browsers support a URL as long as 4K.)
Hmm I suppose it'd be better to have an unnecessarily large field here than have to make a migration later just to extend this.
from data_refinery_models.models.batches import Batch


class ProcessorJob(TimeTrackedModel):
A docstring of class description will be nice.
try:
    taxonomy_id = get_taxonomy_id_scientific(name)
    is_scientific_name = True
except UnscientifcNameError:
If UnscientifcNameError is a typo, it should be corrected here too.


class SurveyJobKeyValue(TimeTrackedModel):
    """
    This table is used for tracking fields onto a SurveyJob record that
It might be better to use "This class" instead of "This table", since the table is implementation details hidden in the database (with a slightly different name), and the docstring here refers to the current class.
call(ESEARCH_URL,
     {"db": "taxonomy", "term": "HUMAN"})])

# The first call should have stored it in the database so
I suppose "it" here refers to the requested organism? Do you mind elaborating this line a little bit to make the comment more readable? Thanks.
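To make the comment's intent concrete, here is a hedged sketch of the lookup-then-cache behaviour being described (the real code stores results in the database via the Organism model; the dict cache and fake_fetch below are stand-ins):

```python
# First lookup for a name hits the remote taxonomy service; the result is
# stored locally so later lookups never call out again.
_organism_cache = {}

def get_id_for_name(name, fetch):
    name = name.upper()
    if name not in _organism_cache:
        _organism_cache[name] = fetch(name)  # remote call, first time only
    return _organism_cache[name]

calls = []

def fake_fetch(name):
    # Stand-in for the NCBI esearch call.
    calls.append(name)
    return 9606  # NCBI taxonomy id for Homo sapiens

first = get_id_for_name("Human", fake_fetch)
second = get_id_for_name("HUMAN", fake_fetch)
```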
{"db": "taxonomy", "id": "9606"}
)

# The first call should have stored it in the database so
See my previous comment on "it".
new_id = Organism.get_id_for_name("BLAH")

self.assertEqual(new_id, 0)
mock_get.assert_not_called()
Great test cases overall! 💯
@@ -1,5 +1,5 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.11 on 2017-05-02 18:50
# Generated by Django 1.10.6 on 2017-05-23 15:17
I skipped this file, is it okay?
Yep, it was automatically generated.
from django.utils import timezone


class TimeTrackedModel(models.Model):
Just realized that this class needs docstring too. Thanks.
from data_refinery_models.models.surveys import SurveyJob


class BatchStatuses(Enum):
Docstring please.
Two more places for docstrings. Nothing else. Thanks.
I approve the code that I reviewed in this PR.
Ok cool, I am reviewing the files in the
def get_name_for_id(cls, taxonomy_id: int) -> str:
    try:
        organism = (cls.objects
                    .filter(taxonomy_id=taxonomy_id)
Is there any case when the taxonomy_id would not be unique?
Yes. If the same organism was listed with two different names in two different batches but they both corresponded to the same NCBI taxonomy_id (since NCBI has a list structure for names the same ID can match multiple names), then there would be one record for each of those names, however they would share an id.
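A small illustration of that situation, with made-up records, mirroring why a .filter(...)-style lookup is used rather than .get(), which would raise MultipleObjectsReturned here:

```python
# Several name records can share one NCBI taxonomy_id, so a lookup by id
# has to tolerate multiple matches. (Records are illustrative.)
records = [
    {"name": "HOMO SAPIENS", "taxonomy_id": 9606},
    {"name": "HUMAN", "taxonomy_id": 9606},
]

matches = [r for r in records if r["taxonomy_id"] == 9606]
# Uniqueness cannot be assumed; take the first match, as .first() would.
name = matches[0]["name"]
```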
Hmmm... I think it would be interesting to see if that just corresponded to different names for the same actual organism object vs. different names for different organism objects (but then why would they have the same taxonomy ID)? Are there any examples in the stuff that was returned from the NCBI API of this?
I'm not quite sure I understand what you're asking. What do you mean by an organism object?
Added a couple more comments & questions, but will continue reviewing the other files.
@@ -16,68 +17,111 @@
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

EXPERIMENTS_URL = "https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/"
Again here - would it make sense for the users to be able to override this in their settings.py file?
I don't think it would because if that URL ever changes then I think there will be additional changes that need to happen to work with that URL. We can also just change this in the code instead of in the settings.py file if we need to.

NCBI_ROOT_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
ESEARCH_URL = NCBI_ROOT_URL + "esearch.fcgi"
EFETCH_URL = NCBI_ROOT_URL + "efetch.fcgi"
I think that sounds reasonable for the organisms - if you think this is a better fit then go for it.
In terms of the URLs, I think that at some point we might want to get this information from another URL endpoint. Also, what if we have a saved XML file with the organism information we want and want to pass it to these methods and model? What parameters would we need to abstract to make this possible?
Okay, I've made it through the workers and other files. I have several questions and suggestions, but overall this looks like a major step forward. Looking forward to seeing this thing work!
@@ -1,5 +1,5 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.11 on 2017-05-02 18:50
# Generated by Django 1.10.6 on 2017-05-26 15:12
I don't usually look at migration files, but happened to notice: is this switch from Django 1.11 to 1.10.6 intentional?
I did this a while ago so I'm not 100% certain what it was but I believe I found a bug that was introduced in 1.11. I wish I had left a comment in the requirements.in file.
Sounds intentional then. 👍
run_rabbitmq.sh (Outdated)
@@ -0,0 +1,3 @@
#!/bin/bash

docker run -d --hostname my-rabbit --name some-rabbit rabbitmq:3
These names my-rabbit and some-rabbit look like defaults from a tutorial. Would it make sense to use something that ties this service in with its function in this implementation a little more? I'm assuming that these are just names here... if not, then the rabbitmq team is having way more fun than I would expect. 😄
RUN groupadd user && useradd --create-home --home-dir /home/user -g user user
WORKDIR /home/user

COPY workers/environment.yml .
Question: does this copy of the environment.yml file ever get used again somewhere else? I'm wondering why we would make a copy of the file and then run it rather than just modify the RUN command below to say

RUN conda env create -f workers/environment.yml

Maybe there's a good reason, but it seems cleaner to me not to have multiple copies lying around.
This COPY command is not just copying the file from one directory to another, but rather from the directory that the Dockerfile is being run from into the docker container. Docker containers start off completely empty, so

RUN conda env create -f workers/environment.yml

wouldn't work because the container can't access a file until it's copied into the container.
That was the piece I was missing. Thanks! Makes sense now.

ENV PATH "/opt/conda/envs/workers/bin:$PATH"

COPY data_models/dist/data-refinery-models-* data_models/
I have a similar question to the one above about why we copy all of the files and then (seemingly) only ever refer to the (ls data_models -1 | tail -1)th file... maybe I'm not quite following what happens with the pip install at line 15 below.
The answer to this question is similar to the answer to the one above it. Because the RUN command is always executed within the context of the container, I cannot run (ls data_models -1 | tail -1) in the Dockerfile and have it interact with files/directories on the host machine without first copying the files into the container. If it were possible, I certainly would agree that figuring out the correct file and THEN copying it into the container would be greatly preferable.
    raise
finally:
    zip_ref.close()
    os.remove(file_path)
Do you mean to remove the .zip file if an exception occurs during extraction? Maybe this line should go after line 55 above? I guess in that case the zip_ref.close() would also have to happen first...

Anyway, just want to be sure that deleting the .zip on an exception is desired. If not, some change to the logic here is warranted.
I think this is correct because if extraction fails then the downloader job will fail and need to be retried. When that happens the zip file will be re-downloaded so it won't be an issue that it's been deleted. Additionally the worker that picks up the new downloader job won't necessarily be the same one so the zip file could end up sitting on the original worker's disk forever.
Good plan, then. Thanks for the clear explanation!
I was thinking that it would be easier to diagnose a failure if the .zip file were left behind, but automated re-trying makes sense. If the error is repeatable it could probably be diagnosed with a newly downloaded .zip file as well.
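The cleanup strategy settled on above can be sketched like this (extract_and_discard is a hypothetical stand-in for the downloader's logic, not the project's actual function name):

```python
import os
import tempfile
import zipfile

def extract_and_discard(zip_path, dest_dir):
    try:
        with zipfile.ZipFile(zip_path) as zip_ref:
            zip_ref.extractall(dest_dir)
    finally:
        # Runs on success *and* on failure, like the os.remove in the
        # finally block under review: the archive never lingers on the
        # worker's disk, since a retried job re-downloads it anyway.
        os.remove(zip_path)

tmp = tempfile.mkdtemp()
zip_path = os.path.join(tmp, "batch.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("data.txt", "hello")

extract_and_discard(zip_path, tmp)
zip_removed = not os.path.exists(zip_path)
extracted = os.path.exists(os.path.join(tmp, "data.txt"))
```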
processor_job.save()

kwargs = utils.start_job({"job": processor_job})
self.assertFalse(kwargs["success"])
It's not obvious to me why this should fail. Can you make a comment about why this would be expected here?
utils.run_pipeline(job_dict, [mock_processor])
mock_processor.assert_called_once()
processor_job.refresh_from_db()
self.assertFalse(processor_job.success)
Why does this fail?
Oh, this time I get it. Just caught line 135.

def test_value_passing(self):
    """The keys added to kwargs and returned by processors will be
    passed through to other processors."""
Multi-line docstring should have ending """ on a line by itself... see above.
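The convention being requested is the standard PEP 257 one: a one-line docstring keeps both sets of quotes on one line, while a multi-line docstring puts the closing quotes on a line of their own. A small sketch:

```python
def one_liner():
    """Do one small thing."""

def multi_liner():
    """Summarize the function on the first line.

    Further detail goes in the body, and the closing quotes
    end up alone on their own line.
    """

# The closing-quotes-on-their-own-line form leaves the raw docstring
# ending in a newline plus the body's indentation:
ends_cleanly = multi_liner.__doc__.endswith("\n    ")
```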
mock_dict = {"something_to_pass_along": True,
             "job": processor_job,
             "batches": [batch]}
mock_dict["something_to_pass_along"] = True
This line is redundant with the end of line 156. Did you mean to leave it here?
I don't think so, good catch.
except Batch.DoesNotExist:
    logger.error("Cannot find batch record with ID %d.", job.batch_id)
batch_relations = ProcessorJobsToBatches.objects.filter(processor_job_id=job.id)
batches = list(map(lambda x: x.batch, batch_relations))
This line can be simpler as described above:
batches = [br.batch for br in batch_relations]
I have a few questions and suggestions, mainly about list and set comprehensions and how errors are handled. Thanks @kurtwheeler!
the line length limit for the autopep8 and flake8 linters. If you run either
of those programs from anywhere within the project's directory tree they will
enforce a limit of 100 instead of 80. This will also be true for editors which
rely on them.
Nice documentation! 👍
SurveyJobKeyValue
    .objects
    .get(survey_job_id=self.survey_job.id,
         key__exact="experiment_accession_code")
What will happen here if no SurveyJobKeyValue objects match the survey_job_id and key passed? Do we just expect django to return that error to the users in the command line?
This survey function is actually called by surveyor.run_job within a try block. If there's no match here the exception will be caught, logged, and the survey_job will be marked as a failure. Given that an experiment_accession_code is required here I think that's the appropriate behavior.
Ok, that sounds good
experiment = self.get_experiment_metadata(experiment_accession_code)

r = requests.get(SAMPLES_URL.format(experiment_accession_code))
samples = r.json()["experiment"]["sample"]
I'm also curious about what the expected behavior is here if the API is down or you get a response that cannot be serialized into json - do you let the Python exception appear to the user in the command line?
My answer here is basically the same as the one above except that I'll add that at some point surveying will be triggered by a foreman who will be responsible for starting and retrying survey_jobs. If the API is down and the job fails then once I get around to writing the foreman code it'll have mechanisms to retry the survey_jobs a few times so hopefully the API will come back up before the retry limit is reached.
👍
))

# Group batches based on their download URL and handle each group.
download_urls = set(map(lambda x: x.download_url, batches))
I think @mhuyck had a similar suggestion in his review. I think you can do a set comprehension for this line to make it simpler:
download_urls = {bt.download_url for bt in batches}
# Group batches based on their download URL and handle each group.
download_urls = set(map(lambda x: x.download_url, batches))
for url in download_urls:
    batches_with_url = list(filter(lambda x: x.download_url == url, batches))
Likewise, I think you can do:

batches_with_url = [bt for bt in batches if bt.download_url == url]
organism_id = Organism.get_id_for_name(organism_name)

for sample_file in sample["file"]:
    if sample_file["type"] != "data" and sample_file["name"] is not None:
Would there be any case where sample_file["name"] would be False or "" instead of None, but we would still not want the sample_file?
Ok this line of code confused me. I can't tell why we would ever want to continue if sample_file["name"] is None since we use that value later and call split on it. I think it's probably just a bug so I'm going to remove everything from the and onward.
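The distinction raised above can be shown with a small sketch (these sample_file dicts are made up for the example; the real data comes from the ArrayExpress API):

```python
# An `is not None` check lets the empty string "" (and False) through,
# while a plain truthiness check filters out every useless name.
sample_files = [
    {"type": "scan", "name": "sample1.cel"},
    {"type": "scan", "name": ""},
    {"type": "scan", "name": None},
    {"type": "data", "name": "ignored.txt"},
]

kept_not_none = [f for f in sample_files
                 if f["type"] != "data" and f["name"] is not None]
kept_truthy = [f for f in sample_files
               if f["type"] != "data" and f["name"]]
```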
"""
This command will create and run survey jobs for each experiment in the
experiment_list. experiment list should be a file containing one
experiment accession code per line.
👍
def handle(self, *args, **options):
    if options["experiment_list"] is None:
        logger.error("You must specify an experiment list.")
        return 0
Doesn't a 0 return value mean that a function has succeeded?
Oh yeah, good catch.
foreman/requirements.txt (Outdated)
@@ -7,11 +7,16 @@
amqp==2.1.4  # via kombu
billiard==3.5.0.2  # via celery
celery==4.0.2
django==1.10.6
django==1.11
I see you are running different versions of django in different parts of this project. I'm just curious, any reason for that?
Hmm, whoops. I don't think I have a reason to be doing that. I think I may have run into a bug that was introduced in 1.11 though, so that's why most of the project is on 1.10.6. I'll change this to match that.
👍
@@ -26,7 +26,7 @@ def test_calls_survey(self, survey_method):
job.save()
surveyor.run_job(job)

survey_method.assert_called_with(job)
survey_method.assert_called_with()
Hmmm, now that this method isn't taking this argument, I'm wondering what exactly it is doing. Should it be called assert_called_with_job or something of the sort? Also, where is it defined again?
This should just be assert_called. The survey_method Mock is passed in by the @patch decorator, which is mocking surveyor.ArrayExpressSurveyor's survey method.

The reason I am no longer passing job to this method is that the ArrayExpressSurveyor object has the job as an instance variable, so it doesn't need to have it passed in.
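The pattern under discussion can be sketched with unittest.mock (the class and functions here are simplified stand-ins for the real surveyor module, not its actual code):

```python
from unittest import mock

class ArrayExpressSurveyor:
    def __init__(self, job):
        self.job = job  # the job is held as an instance variable

    def survey(self):
        return "surveyed " + self.job

def run_job(job):
    # The surveyor already holds the job, so survey() takes no arguments.
    ArrayExpressSurveyor(job).survey()

with mock.patch.object(ArrayExpressSurveyor, "survey") as survey_method:
    run_job("job-1")
    survey_method.assert_called()        # invoked at least once
    survey_method.assert_called_with()   # and with no arguments passed in
```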
Ok, if you change it to assert_called, I think that it makes sense.
@@ -0,0 +1,3 @@
#!/bin/bash

docker run -d --hostname rabbit-queue --name message-queue rabbitmq:3
👍
target_file_path,
batch.id,
job_id)
if success:
👍
request.close()
# The files for all of the batches in the grouping are
# contained within the same zip file. Therefore only
# download the one.
👍
self.assertEqual(len(kwargs["batches"]), 2)

def test_failure(self):
    """Fails because there are no batches for the job."""
👍
utils.end_job(job, None, False)
return
batch_relations = DownloaderJobsToBatches.objects.filter(downloader_job_id=job_id)
batches = [br.batch for br in batch_relations]
👍
except Batch.DoesNotExist:
    logger.error("Cannot find batch record with ID %d.", job.batch_id)
batch_relations = ProcessorJobsToBatches.objects.filter(processor_job_id=job.id)
batches = [br.batch for br in batch_relations]
👍
Except for a handful of
OK @rzelayafavila I just pushed the last change you requested I think. Is it lookin good?
Ok, LGTM @kurtwheeler! Good stuff!
This PR got out of hand. The original scope of it was to improve the metadata we were collecting about batches of data from Array Express. However halfway through completing that it was decided that we should change the batch size for Array Express from an experiment to a single sample. Since the batches the metadata was about were changing, it didn't seem to make sense to change that metadata without also changing the batch size. However in the process of doing that I discovered that it's not possible to download a single batch worth of data (i.e. a single sample). Therefore I needed to also refactor the assumption that one downloader job == one batch. This was a large change that affected a lot of the existing code. I also added a lot of new test code to previously untested functionality, because up until these changes that functionality was changing so much that I was hesitant to write tests for it.

So that's the story of why I now have so many changes in a single PR. I'm not entirely sure how to proceed with this PR. I don't think that too much of it can be broken out into smaller PRs. The only large chunk of code that makes sense on its own is the organism model code. However breaking this out will make things somewhat more complicated because the addition of the organism model code prompted me to break the models.py file up into several files contained in a module. I then made changes to other models as I implemented the rest of the functionality in this branch.

Anyway... I'm happy to do whatever makes this PR more digestible for reviewers. If that means breaking out the organism model code into its own PR and duplicating the split of models.py in two PRs I can do that. If it's easier to leave it in this PR but have a separate reviewer look at the organism model code I can do that.