change README.md to match index.rst
NickleDave committed Dec 26, 2018
1 parent 0dfc49b commit 03d144d
Showing 1 changed file with 57 additions and 8 deletions.
65 changes: 57 additions & 8 deletions README.md
@@ -2,22 +2,71 @@
[![Build Status](https://travis-ci.com/NickleDave/crowsetta.svg?branch=master)](https://travis-ci.com/NickleDave/crowsetta)
[![Documentation Status](https://readthedocs.org/projects/crowsetta/badge/?version=latest)](https://crowsetta.readthedocs.io/en/latest/?badge=latest)

`crowsetta` is a tool to work with any format for annotating birdsong.
**The goal of** `crowsetta` **is to make sure that your ability to work with a dataset
of birdsong does not depend on your ability to work with any given format for
annotating that dataset.** What `crowsetta` gives you is **not** yet another format for
annotation (I promise!); instead you get some nice data types that make it easy to
work with any format: namely, `Sequence`s made up of `Segment`s.

```Python
>>> from crowsetta import Segment, Sequence
>>> a_segment = Segment.from_keyword(
...     label='a',
...     onset_Hz=16000,
...     offset_Hz=32000,
...     file='bird21.wav'
... )
>>> list_of_segments = [a_segment] * 3
>>> seq = Sequence(segments=list_of_segments)
>>> print(seq)
Sequence(segments=[Segment(label='a', onset_s=None, offset_s=None, onset_Hz=16000,
offset_Hz=32000, file='bird21.wav'), Segment(label='a', onset_s=None, offset_s=None,
onset_Hz=16000, offset_Hz=32000, file='bird21.wav'), Segment(label='a', onset_s=None,
offset_s=None, onset_Hz=16000, offset_Hz=32000, file='bird21.wav')])
```

You can load annotation from your format of choice into `Sequence`s of `Segment`s
(most conveniently with the `Transcriber`, as explained below) and then use the
`Sequence`s however you need to in your program.

For example, if you want to loop through the `Segment`s of each `Sequence` to
pull syllables out of a spectrogram, you can do something like this, very Pythonically:

```Python
>>> syllables_from_sequences = []
>>> for a_seq in seq:  # here seq is a list of Sequences
...     seq_dict = a_seq.to_dict()  # convert the Sequence to a dict
...     spect = some_spectrogram_making_function(seq_dict['file'])
...     syllables = []
...     for seg in a_seq.segments:
...         syllable = spect[:, seg.onset:seg.offset]  # spectrogram is a 2d numpy array
...         syllables.append(syllable)
...     syllables_from_sequences.append(syllables)
```
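
The loop above indexes the spectrogram directly with onsets and offsets. If the onsets are stored in audio samples, they first need converting to spectrogram columns. Here is a minimal, self-contained sketch of that conversion. It assumes onsets and offsets are sample numbers and that each spectrogram column spans `hop_length` samples. `Seg`, `slice_syllables`, and `hop_length` are hypothetical names for illustration, not part of `crowsetta`, and a nested list stands in for the 2d array.

```python
from typing import List, NamedTuple


class Seg(NamedTuple):
    """Hypothetical stand-in for a Segment; not the crowsetta class."""
    label: str
    onset_sample: int
    offset_sample: int


def slice_syllables(spect: List[List[int]], segments: List[Seg],
                    hop_length: int) -> List[List[List[int]]]:
    """Cut one sub-spectrogram per segment.

    spect is a list of rows (frequency bins), each a list of time-bin values.
    hop_length is the assumed number of audio samples per spectrogram column.
    """
    syllables = []
    for seg in segments:
        start = seg.onset_sample // hop_length       # round down to a column
        stop = -(-seg.offset_sample // hop_length)   # ceiling division
        syllables.append([row[start:stop] for row in spect])
    return syllables


# Toy "spectrogram": 2 frequency bins by 8 time bins.
spect = [[col + 10 * row for col in range(8)] for row in range(2)]
segs = [Seg('a', 0, 200), Seg('b', 300, 512)]
syllables = slice_syllables(spect, segs, hop_length=100)
```

Rounding the onset down and the offset up keeps every column that overlaps the segment, which is usually the safer choice when cutting out syllables.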

As mentioned above, `crowsetta` provides you with a `Transcriber` that comes equipped
with convenience functions to do the work of converting for you.

```Python
from crowsetta import Transcriber
scribe = Transcriber()
seq = scribe.to_seq(file=notmat_files, format='notmat')
```

You can even easily adapt the `Transcriber` to use your own in-house format, like so:

```Python
from crowsetta import Transcriber
scribe = Transcriber(user_config=your_config)
scribe.to_csv(file='your_annotation_file.mat',
              csv_filename='your_annotation.csv')
```
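
To give a feel for what plugging in your own format could involve, here is a toy sketch of the general idea: a mapping from format names to parser functions. It is not how `crowsetta` implements `Transcriber`, and the real structure of `user_config` is not shown in this README. `MiniTranscriber`, `parse_tab_format`, and the tab-separated format are all hypothetical.

```python
class MiniTranscriber:
    """Toy illustration only; not crowsetta's Transcriber."""

    def __init__(self, user_config=None):
        # user_config is assumed here to map a format name to a parser function.
        self._parsers = dict(user_config or {})

    def to_seq(self, text, format):
        """Parse annotation text with the parser registered for `format`."""
        try:
            parser = self._parsers[format]
        except KeyError:
            raise ValueError(f"no parser registered for format {format!r}") from None
        return parser(text)


def parse_tab_format(text):
    """Hypothetical in-house format: 'label<TAB>onset<TAB>offset' per line."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        label, onset, offset = line.split('\t')
        segments.append(
            {'label': label, 'onset_s': float(onset), 'offset_s': float(offset)}
        )
    return segments


scribe = MiniTranscriber(user_config={'tab': parse_tab_format})
segments = scribe.to_seq("a\t0.1\t0.25\nb\t0.3\t0.5\n", format='tab')
```

For brevity the sketch parses a string rather than opening a file, so it runs without any annotation files on disk.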

## Features

- convert annotation formats to `Sequence` objects that can be easily used in a Python program
- convert `Sequence` objects to comma-separated value text files that can be read on any system
- load comma-separated values files back into Python and convert to other formats
- easily use with your own annotation format
