Update topic parameter to be the topic path. #49

Merged
merged 1 commit into from Apr 21, 2019
14 changes: 6 additions & 8 deletions README.md
@@ -4,12 +4,11 @@ What's a [jig](https://en.wikipedia.org/wiki/Jig_(tool))?

Run the `init.sh` script to download + compile `trec_eval` and download the appropriate topics + qrels.

- To test the jig with an Anserini image, try:
+ To test the jig with an Anserini image using default parameters, try:

```
python run.py prepare \
--repo osirrc2019/anserini \
- --tag latest \
--collections [name]=[path]=[format] ...
```

@@ -18,22 +17,20 @@ then
```
python run.py search \
--repo osirrc2019/anserini \
- --tag latest \
--collection [name] \
- --topic [topic_file_name] \
+ --topic topics/[topic] \
--output /path/to/output \
- --qrels $(pwd)/qrels/[qrels]
+ --qrels qrels/[qrels]
```

Change:
- `[name]` and `[path]` to the collection name and path on the host, respectively
- `[format]` is one of `trectext`, `trecweb`, `json`, or `warc`
- - `[topic_file_name]` to the name of the topic file
+ - `[topic]` to the path of the topic file
- `/path/to/output` to the desired output directory.
- `[qrels]` to the appropriate qrels file

The output run files will appear in the argument of `--output`.
- Note that `topic` is just the name of the file from the `topics` dir.
The full command line parameters are below.

## Command Line Options
@@ -62,7 +59,8 @@ Options with `none` as the default are required.
| `--tag` | `string` | `latest` | `--latest` | the tag on Docker Hub
| `--collection` | `string` | `none` | `--collection robust04` | the collections to index
| `--save_id` | `string` | `save` | `--save_id robust04-exp1` | the ID of the intermediate image
- | `--topic` | `string` | `none` | `--topic topics.robust04.301-450.601-700.txt` | the name (not path) of the topic file
+ | `--topic` | `string` | `none` | `--topic topics/topics.robust04.301-450.601-700.txt` | the path of the topic file
+ | `--topic_format` | `string` | `trec` | `--topic_format trec` | the format of the topic file
| `--top_k` | `int` | `1000` | `--top_k 500` | the number of results for top-k retrieval
| `--output` | `string` | `none` | `--output $(pwd)/output` | the output path for run files
| `--qrels` | `string` | `none` | `--qrels $(pwd)/qrels/qrels.robust2004.txt` | the qrels file for evaluation
2 changes: 1 addition & 1 deletion run.py
@@ -25,7 +25,7 @@
parser_search.add_argument("--save_id", default="save", type=str, help="the ID of the saved image (to search from)")
parser_search.add_argument("--collection", required=True, help="the name of the collection")
parser_search.add_argument("--topic", required=True, type=str, help="the topic file for search")
- parser_search.add_argument("--topic_format", default="TREC", type=str, help="the topic file format for search")
+ parser_search.add_argument("--topic_format", default="trec", type=str, help="the topic file format for search")
parser_search.add_argument("--top_k", default=1000, type=int, help="the number of results for top-k retrieval")
parser_search.add_argument("--output", required=True, type=str, help="the output directory for run files on the host")
parser_search.add_argument("--qrels", required=True, type=str, help="the qrels file for evaluation")
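
As a rough illustration of the `run.py` change, the sketch below builds a stripped-down parser and shows that `--topic_format` falls back to lowercase `trec` when it is omitted. Only the two `add_argument` calls come from the diff; the `build_search_parser` helper and the subparser wiring are assumptions made for this example and may not match the real script.

```python
import argparse


def build_search_parser():
    # Hypothetical helper: the real run.py may wire its subcommands differently.
    parser = argparse.ArgumentParser(prog="run.py")
    subparsers = parser.add_subparsers(dest="command")
    parser_search = subparsers.add_parser("search")
    # These two arguments mirror the post-change lines in the diff.
    parser_search.add_argument("--topic", required=True, type=str,
                               help="the topic file for search")
    parser_search.add_argument("--topic_format", default="trec", type=str,
                               help="the topic file format for search")
    return parser


# --topic now carries a path; --topic_format is left to its default.
args = build_search_parser().parse_args(
    ["search", "--topic", "topics/topics.robust04.301-450.601-700.txt"]
)
print(args.topic)         # topics/topics.robust04.301-450.601-700.txt
print(args.topic_format)  # trec
```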
2 changes: 1 addition & 1 deletion searcher.py
@@ -39,7 +39,7 @@ def search(self, client, output_path_guest, topic_path_host, topic_path_guest, g
},
"opts": {key: value for (key, value) in map(lambda x: x.split("="), self.config.opts)},
"topic": {
- "path": os.path.join(topic_path_guest, self.config.topic),
+ "path": os.path.join(topic_path_guest, os.path.basename(self.config.topic)),
"format": self.config.topic_format
},
"top_k": self.config.top_k
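
The `searcher.py` change can be sketched in isolation as follows. `guest_topic_path` and the `/input/topics` mount point are hypothetical names used only for illustration; the `os.path.join(..., os.path.basename(...))` expression is the one from the diff. Because only the base name is kept, a host path such as `topics/[file]` and a bare file name should resolve to the same location inside the container.

```python
import os


def guest_topic_path(topic_path_guest, topic):
    # Keep only the file name from the host-side --topic value, then
    # re-root it under the directory mounted inside the container.
    return os.path.join(topic_path_guest, os.path.basename(topic))


# "/input/topics" is an assumed guest directory, not taken from the diff.
guest_dir = "/input/topics"

# New style: --topic is a path on the host.
from_path = guest_topic_path(guest_dir, "topics/topics.robust04.301-450.601-700.txt")

# Old style: --topic is a bare file name; basename() leaves it unchanged.
from_name = guest_topic_path(guest_dir, "topics.robust04.301-450.601-700.txt")

print(from_path)
print(from_path == from_name)  # True
```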