Release v0.1.0 - updated version and README
hlgirard committed Apr 4, 2019
1 parent 52d5ae3 commit 0251a9a
Showing 6 changed files with 69 additions and 55 deletions.
68 changes: 39 additions & 29 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,9 @@

Graphical tool to manually label images in distinct categories to build training datasets.
Simply pass a list of categories, a directory containing images and start labelling.
Supports keyboard bindings to label even faster!
Supports multiple users, reconciliation and keyboard bindings to label even faster!

![screenshot](docs/screenshot_190228.png)
![screenshot](docs/screenshot_190404.png)

## Installation

Expand Down Expand Up @@ -34,16 +34,45 @@ pip install .

## Usage

### Command line tools
### Quick start

Pass the categories and image directory on the command line to start labelling. Use the on-screen buttons to select a label for the current image and advance to the next one. Number keys correspond to labels and can be used instead. A 'remove' label is automatically added to the list of passed categories.
Pass the labels and image directory on the command line to start labelling. Use the on-screen buttons to select a label for the current image and advance to the next one. Number keys correspond to labels and can be used instead.

```
simplabel --categories dog cat bird --directory path/to/image/directory
simplabel --labels dog cat bird --directory path/to/image/directory
```

After the first use, labels are stored in 'labels.pkl' and there is no need to pass the '--categories' argument unless you want to add labels.
You can also use '--reset' to delete the saved labels and dictionary from the directory before execution.
After the first use, labels are stored in `labels.pkl` and the `--labels` argument is ignored.

### Command line arguments

- `-d, --directory <PATH/TO/DIRECTORY>` sets the directory to search for images and save labels to. Defaults to the current working directory.
- `-l, --labels <label1 label2 label3 ...>` sets the categories for the labelling task. Only passed on the first use in a given directory.
- `-u, --user <USERNAME>` sets the username. Defaults to the OS login name if none is passed.
- `-r, --redundant` hides other labelers' selections so that each user labels independently. Reconciliation and Make Master are unavailable in this mode.
- `-v, --verbose` increases the verbosity level.
- `--remove-label` tries to safely remove a label from the list saved in `labels.pkl`.
- `--reset-lock` overrides the lock preventing the same username from being used multiple times simultaneously.
- `--delete-all` removes all files created by simplabel in the directory.

### Multiuser

The app relies on the filesystem to save each user's selections and to display other users' selections. It works best if the working directory is on a shared drive or in a synced folder (Dropbox, OneDrive, ...). The Reconcile workflow allows any user to see and resolve conflicts. The Make Master option can be used to create and save a master dictionary - `labeled_master.pkl` - containing all labeled images (after reconciliation).
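
As a rough illustration of what reconciliation has to detect, the sketch below loads every per-user `labeled_<username>.pkl` file from a directory and separates images on which all labelers agree from images with conflicting labels. This is not simplabel's own implementation: the function names `load_user_labels` and `split_by_agreement` are hypothetical, and the sketch assumes each file holds a plain `{image_name: label}` dictionary.

```python
import pickle
from pathlib import Path


def load_user_labels(directory):
    """Return {username: {image_name: label}} for each labeled_<username>.pkl file."""
    per_user = {}
    for path in Path(directory).glob("labeled_*.pkl"):
        username = path.stem[len("labeled_"):]
        if username == "master":
            continue  # labeled_master.pkl is the reconciled output, not a user file
        with open(path, "rb") as f:
            per_user[username] = pickle.load(f)
    return per_user


def split_by_agreement(per_user):
    """Split images into (agreed, conflicts) based on the labels each user gave."""
    votes = {}
    for username, labels in per_user.items():
        for image_name, label in labels.items():
            votes.setdefault(image_name, {})[username] = label

    agreed = {}     # image_name -> the single label every labeler chose
    conflicts = {}  # image_name -> {username: label} when labels differ
    for image_name, by_user in votes.items():
        if len(set(by_user.values())) == 1:
            # Note: an image labeled by only one user also counts as agreed here
            agreed[image_name] = next(iter(by_user.values()))
        else:
            conflicts[image_name] = by_user
    return agreed, conflicts
```

Calling `split_by_agreement(load_user_labels("path/to/image/directory"))` would return the two dictionaries. Pickle files can execute code when loaded, so only load files from labelers you trust.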

### Import saved labels

The app saves a `labeled_<username>.pkl` file that contains a pickled dictionary `{image_name: label}`. To import the dictionary, use the following sample code:

```python
import pickle

with open("labeled_user1.pkl", "rb") as f:
    label_dict = pickle.load(f)
```
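
Once the dictionary is loaded, it can be inspected like any other Python dict. The following is a minimal sketch, not part of simplabel: the helper names `count_labels` and `images_with_label` are hypothetical, and a small in-memory dictionary stands in for the result of `pickle.load(f)`.

```python
from collections import Counter


def count_labels(label_dict):
    """Return a Counter mapping each label to the number of images carrying it."""
    return Counter(label_dict.values())


def images_with_label(label_dict, label):
    """Return the sorted image names that were assigned the given label."""
    return sorted(name for name, assigned in label_dict.items() if assigned == label)


# Stand-in for the dictionary returned by pickle.load(f); file names are made up
label_dict = {"img_001.jpg": "dog", "img_002.jpg": "cat", "img_003.jpg": "dog"}

counts = count_labels(label_dict)
dogs = images_with_label(label_dict, "dog")
print(counts.most_common())
print(dogs)
```

A per-label count like this is a quick way to check class balance before building a training set.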

## Advanced usage

### Utilities

Once you are done labelling, use the `flow_to_directory` tool to copy images to distinct directories by label:

Expand All @@ -53,6 +82,8 @@ flow_to_directory --rawDirectory data/raw --outDirectory data/labeled

### Python object

The Tkinter app can also be started from a python environment

```python
from simplabel import ImageClassifier
import tkinter as tk
Expand All @@ -62,25 +93,4 @@ directory = "data/raw"
categories = ['dog', 'cat', 'bird']
MyApp = ImageClassifier(root, directory, categories)
tk.mainloop()
```

### Saved labels

The app saves a labeled.pkl file that contains a pickled dictionary {image_name: label}. To import the dictionary, use the following sample code:

```python
import pickle

with open("labeled.pkl", "rb") as f:
    label_dict = pickle.load(f)
```

### Move labeled images to discrete directories

This utility copies labeled images from the raw directory to discrete folders by label in the labelled directory using the dictionary created by simplabel.

```python
from simplabel import utils

utils.flow_to_dict(rawDirectory, labelledDirectory)
```
```
6 changes: 3 additions & 3 deletions bin/simplabel
Expand Up @@ -12,12 +12,12 @@ import simplabel
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--directory", default=os.getcwd(), help="Path of the directory")
ap.add_argument("-l", "--labels", nargs='+', default=None, help="List of labels")
ap.add_argument("-v", "--verbose", action='store_true', help="Enable verbose mode")
ap.add_argument("-v", "--verbose", action='count', default=0, help="Enable verbose mode")
ap.add_argument("-u", "--user", help="Set username for the current session")
ap.add_argument("-r", "--redundant", action='store_true', help="Redundant mode: do not show other labeler's selections")
ap.add_argument("--delete-all", action='store_true', help="Deletes all files created by simplabel in a directory, this resets the labels and all saved data")
ap.add_argument("--reset-lock", action='store_true', help="Overrides the lock in case of incorrect lockout")
ap.add_argument("--remove-label", help="Remove a label from the list")
ap.add_argument("--redundant", action='store_true', help="Redundant mode: do not show other labeler's selections")



Expand All @@ -26,7 +26,7 @@ args = ap.parse_args()
# Get the variables from parser
rawDirectory = args.directory
categories = args.labels
verbosity = 1 if args.verbose else 0
verbosity = args.verbose
username = args.user
bResetLock = args.reset_lock
bRedundant = args.redundant
Expand Down
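
The switch from `action='store_true'` to `action='count'` means that `-v` can be repeated to raise the verbosity. A minimal sketch of how the resulting count could map to a logging level, following the thresholds used in the `ImageClassifier` logger setup (1 for INFO, 2 or more for DEBUG, otherwise WARNING), is shown here. The helper name `verbosity_to_level` is hypothetical and does not appear in the commit.

```python
import argparse
import logging


def verbosity_to_level(verbosity):
    """Map a -v count to a logging level: 0 -> WARNING, 1 -> INFO, 2 or more -> DEBUG."""
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    return logging.WARNING


ap = argparse.ArgumentParser()
# action="count" increments the value once for each occurrence of the flag
ap.add_argument("-v", "--verbose", action="count", default=0, help="Increase verbosity")

# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere
args = ap.parse_args(["-vv"])
logging.basicConfig(level=verbosity_to_level(args.verbose),
                    format="%(levelname)s - %(message)s")
logging.debug("Debug output is enabled")
```

With `default=0`, omitting the flag yields an integer rather than `None`, so the value can be compared numerically without a special case.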
Binary file added docs/screenshot_190404.png
2 changes: 1 addition & 1 deletion setup.py
Expand Up @@ -4,7 +4,7 @@
long_description = f.read()

setup(name='simplabel',
version='0.0.4',
version='0.1.0',
description='Simple tool to manually label images in distinct categories to build training datasets.',
long_description=long_description,
long_description_content_type="text/markdown",
Expand Down
4 changes: 3 additions & 1 deletion simplabel/__init__.py
@@ -1 +1,3 @@
from .simplabel import *
from .simplabel import *

__version__ = '0.1.0'
44 changes: 23 additions & 21 deletions simplabel/simplabel.py
Expand Up @@ -32,10 +32,12 @@ class ImageClassifier(tk.Frame):
Interval in seconds between auto-save and auto-refresh of master dict actions (0 to disable)
bResetLock: bool
When true, ignores and resets the lock that prevents multiple users from using the same username
bRedundant: bool
When true, other labeler's selections are not displayed. Reconcile and Master are not available in this mode.
Notable attributes
Notable outputs
-------
labelled : dict(string: string)
labelled_user.pkl : pickled dict(string: string)
Dictionary containing the labels in the form {'relative/path/image_name.jpg': label}
This dict is saved to disk by the 'Save' button
"""
Expand All @@ -46,10 +48,9 @@ def __init__(self, parent, directory, categories = None, verbose = 0, username =
tk.Frame.__init__(self, parent, *args, **kwargs)

# Initialize logger
verbose = 2 # FIXME: verbosity is set to debug level
if verbose == 1:
logging.basicConfig(level=logging.INFO, format='%(levelname)s - %(message)s')
elif verbose == 2:
elif verbose >= 2:
logging.basicConfig(level=logging.DEBUG, format='%(levelname)s - %(message)s')
else:
logging.basicConfig(level=logging.WARNING, format='%(levelname)s - %(message)s')
Expand Down Expand Up @@ -94,8 +95,9 @@ def __init__(self, parent, directory, categories = None, verbose = 0, username =
logging.info("Existing users: {}".format(self.users))

# Assign a color for each user
# TODO: Rewrite to ensure each user has a separate color if possible
self.userColors = {user: self.user_color_helper(user) for user in self.users}
self.userColors = {}
for user in self.users:
self.userColors[user] = self.user_color_helper(user)
self.userColors['master'] = '#3E4149'

# Set the username for the current session
Expand Down Expand Up @@ -146,7 +148,6 @@ def __init__(self, parent, directory, categories = None, verbose = 0, username =
self.gotLock = True

# Directory containing the saved labeled dictionary
## Note: username will be "guest" if none was passed as command line argument
self.savepath = self.folder + "/labeled_" + self.username +".pkl"

# Initialize UI
Expand Down Expand Up @@ -260,7 +261,7 @@ def initialize_labels(self):

# If no file and no categories passed, warn and exit
else:
logging.warning("No labels provided. Exiting.")
logging.warning("No labels provided. Use '-l label1 label2 ...' to add them. Exiting.")
self.errorClose()

def initialize_data(self):
Expand Down Expand Up @@ -573,34 +574,36 @@ def display_image(self):
self.catButton[i].config(highlightbackground = self.buttonOrigColor)

# Display the associated label(s) from any user as colored background for the label button
## If in reconcileMode, display the chosen label in grey
if self.reconciledLabelsDict and img in self.reconciledLabelsDict:
label = self.reconciledLabelsDict[img]
idxLabel = self.categories.index(label)
self.catButton[idxLabel].config(highlightbackground='#3E4149')
elif img in self.allLabeledDict:
else:
labelDict = {}
for (user, label) in self.allLabeledDict[img].items():
if label in labelDict:
labelDict[label].append(self.userColors[user])
else:
labelDict[label] = [self.userColors[user]]
# The img might be in self.labeled but not yet in self.allLabeledDict (between updates of allLabeledDict)
## In normal mode, check allLabeledDict for other user's labels
if img in self.allLabeledDict:
for (user, label) in self.allLabeledDict[img].items():
### Current user's data might not be up to date in allLabeledDict, so self.labeled is used instead
if user != self.username:
if label in labelDict:
labelDict[label].append(self.userColors[user])
else:
labelDict[label] = [self.userColors[user]]
## Get current user's label from self.labeled
if img in self.labeled:
label = self.labeled[img]
if label in labelDict and self.userColor not in labelDict[label]:
labelDict[label].append(self.userColor)
elif label not in labelDict:
labelDict[label] = [self.userColor]
## Finally, change the button color accordingly
for label in labelDict:
idxLabel = self.categories.index(label)
if len(labelDict[label]) == 1:
self.catButton[idxLabel].config(highlightbackground=labelDict[label][0])
else:
self.catButton[idxLabel].config(highlightbackground='#3E4149')
elif img in self.labeled:
label = self.labeled[img]
idxLabel = self.categories.index(label)
self.catButton[idxLabel].config(highlightbackground=self.userColor)

# Disable back button if on first image
if self.counter == 0:
Expand Down Expand Up @@ -762,8 +765,7 @@ def goto_next_unlabeled(self):
self.display_image()

def sort_conflicting_imgs(self):
'''Returns a sub-lists of images: (labeledAgreed, labeledDisagreed, toLabel)'''
# TODO: speed up this method ?
'''Returns sub-lists of images: (labeledAgreed, labeledDisagreed, toLabel)'''

labeledAgreed = []
labeledDisagreed = []
Expand Down
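
The button-coloring rule in the `display_image` hunk can be read as a small pure function. It collects the colors of other users per label, adds the current user's color from their own in-memory label, then shows a single color when one user chose the label and grey when several did. The standalone sketch below restates that rule outside of Tkinter. The function name `button_colors` and its parameters are hypothetical, and the grey value is the `'#3E4149'` used in the diff.

```python
GREY = "#3E4149"  # color used in the diff when several users chose the same label


def button_colors(all_labels_for_img, own_label, user_colors, username, own_color):
    """Return {label: color} for one image.

    all_labels_for_img: {user: label} as stored for the image (may be stale
    for the current user); own_label: the current user's label or None.
    """
    colors_by_label = {}
    for user, label in all_labels_for_img.items():
        # Skip the current user: their stored entry may be out of date
        if user != username:
            colors_by_label.setdefault(label, []).append(user_colors[user])

    if own_label is not None:
        colors = colors_by_label.setdefault(own_label, [])
        if own_color not in colors:
            colors.append(own_color)

    # One color: show that user's color; several colors: show grey
    return {label: colors[0] if len(colors) == 1 else GREY
            for label, colors in colors_by_label.items()}
```

For example, if one other user picked "dog" and the current user also picked "dog", the "dog" button turns grey, while a label picked by a single user takes that user's color.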
