# Notebook 9: Getting lines from just one speaker
### Kasper Fyhn Jacobsen

Once again, I was in the pleasant situation that my `childes_transcripts` module, and more specifically its `Transcript` class, could already do this. However, since I had not used it much myself yet, this was an ideal occasion to improve it. As always, the module can be seen [here](https://github.com/KasperFyhn/ChildLangAcqui/blob/master/src/childes_transcripts.py) on GitHub.

## Improving the class

Before, I had defined an instance method that returns all non-header lines as tuples, with `tuple[0]` being the speaker or comment/annotation type and `tuple[1]` being the actual line. In the class constructor, i.e. the `__init__()` method, an instance variable `lines` (now changed to `raw_lines`) is created; this variable is a list of all raw lines except headers. The method simply turned each line into a tuple as described and returned the list, like this:

```python
def lines_as_tuples(self):
    '''Return a list of tuples of all utterance lines, where tuple[0] is
    the three letter initials for the speaker and tuple[1] is the line.'''

    tuples = [(line[1:4], line[5:]) for line in self.lines]

    return tuples
```
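To make the slicing concrete, here is a minimal standalone sketch of the same comprehension outside the class. The sample lines and the helper name `to_tuples` are made up for illustration, and the sketch assumes that each stored line consists of a one-character marker, a three-letter code, one separator character and then the text; the actual layout of the lines in the class may differ.

```python
# Hypothetical sample lines, assuming the layout:
# marker (1 char) + three-letter code + separator (1 char) + text
sample_lines = [
    '*CHI:this is broken (.) Mom .',
    "*MOT:what's broke ?",
    '%mor:pro:int|what~aux|be&3S adj|broke ?',
]

def to_tuples(lines):
    """Split each line into (code, text) using fixed slice positions."""
    return [(line[1:4], line[5:]) for line in lines]

tuples = to_tuples(sample_lines)
for code, text in tuples:
    print(code, '->', text)
```

Note that an annotation tier such as `mor` ends up in `tuple[0]` in exactly the same way as a speaker does, since the slice ignores the leading marker.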

The `tokens()` method relied on this method. So, I did two things: 1) I made it possible to request lines from just one or more specified speakers, and 2) I removed the "speaker missing" warning from the `tokens()` method and moved it into this `lines_as_tuples()` method. Then, it looked like this:

```python
def lines_as_tuples(self, speakers='all'):
    '''Return a list of tuples of all utterance lines, where tuple[0] is
    the three letter initials for the speaker and tuple[1] is the line. One
    or more speakers can be specified to retrieve just lines by these.'''

    tuples = [(line[1:4], line[5:]) for line in self.lines]
    
    # filter the list to contain only the specified speakers
    if speakers != 'all':
        if isinstance(speakers, str):
            speakers = [speakers]
        tuples = [line for line in tuples if line[0] in speakers]

        # check if the requested speakers are present in the transcript
        # and report if they are not
        for speaker in speakers:
            if speaker not in self.speakers():
                print(f'WARNING: The speaker {speaker} is not present ' +
                      f'in the transcript {self.name}.')

    return tuples
```
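The filtering step on its own can be sketched as a small standalone function. The name `filter_by_speakers` and the sample data are hypothetical and only serve to illustrate the idea; the warning about missing speakers is left out here.

```python
def filter_by_speakers(tuples, speakers='all'):
    """Keep only tuples whose first element is one of the requested codes."""
    if speakers == 'all':
        return list(tuples)
    if isinstance(speakers, str):  # a single code such as 'MOT'
        speakers = [speakers]
    return [line for line in tuples if line[0] in speakers]

# Made-up data in the (speaker, line) format described above
sample = [
    ('CHI', 'this is broken (.) Mom .'),
    ('MOT', "what's broke ?"),
    ('FAT', 'wow !'),
]

only_mot = filter_by_speakers(sample, 'MOT')
mot_and_chi = filter_by_speakers(sample, ['MOT', 'CHI'])
print(only_mot)
print(mot_and_chi)
```

Wrapping a single string in a list matters because `in` applied to a string is a substring test, whereas `in` applied to a list compares whole elements.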

And I thought I was finished. But then I remembered the annotations, which I figured might be needed in some situations. Getting these out was easy enough: just put the name of the annotation tier in the list of requested speakers. However, this would return ALL annotations of that tier and not just the ones belonging to the lines that we requested. It would also produce warning messages, since the tier names are not speakers in the transcript. So I moved on to making this possible and ended up with this, a now more versatile method:

```python
def lines_as_tuples(self, speakers='all', morphosyntax=False,
                    grammar=False, actions=False):
    '''Return a list of tuples of all utterance lines, where tuple[0] is
    the three letter initials for the speaker and tuple[1] is the line. One
    or more speakers can be specified to retrieve just lines by these and 
    one or more flags can be marked to get annotations for the requested
    speaker(s).'''

    if speakers == 'all':
        speakers = self.speakers()
    if isinstance(speakers, str): # make sure that it is a list
        speakers = [speakers]
    speakers = list(speakers) # copy, so the caller's list is not modified

    # check if the requested speakers are present in the transcript
    # and report if they are not
    for speaker in speakers:
        if speaker not in self.speakers():
            print(f'WARNING: The speaker {speaker} is not present ' +
                  f'in the transcript {self.name}.')

    # add the requested annotation tiers to the codes to keep
    if morphosyntax:
        speakers.append('mor')
    if grammar:
        speakers.append('gra')
    if actions:
        speakers.append('act')

    # make list with lines as three part tuples
    tuples = [(line[0], line[1:4], line[5:]) for line in self.lines]

    # divide into blocks of turns with their annotations
    blocks = []
    for line in tuples:
        if line[0] == '*':
            blocks.append([(line[1], line[2])])
        elif line[0] == '%':
            blocks[-1].append((line[1], line[2]))

    # put together the blocks of the requested speakers along with the
    # requested annotations
    tuples = []
    for block in blocks:
        if block[0][0] in speakers:
            tuples += [line for line in block if line[0] in speakers]

    return tuples
```
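The core idea, grouping each utterance with its annotation tiers and then selecting whole blocks by speaker, can be sketched outside the class as follows. The function names `group_into_blocks` and `select` and the sample data are hypothetical; this is a simplified illustration under those assumptions, not the module's actual code.

```python
def group_into_blocks(lines):
    """Group (marker, code, text) tuples into blocks: one utterance ('*')
    followed by its annotation tiers ('%')."""
    blocks = []
    for marker, code, text in lines:
        if marker == '*':
            blocks.append([(code, text)])
        elif marker == '%' and blocks:  # skip annotations with no utterance
            blocks[-1].append((code, text))
    return blocks

def select(blocks, speakers, tiers=()):
    """Return lines of the requested speakers plus the requested tiers."""
    wanted = set(speakers) | set(tiers)
    selected = []
    for block in blocks:
        if block[0][0] in speakers:  # block[0] is always the utterance
            selected += [line for line in block if line[0] in wanted]
    return selected

# Made-up three-part tuples in the format used by the method above
sample = [
    ('*', 'CHI', "that's not paper ."),
    ('%', 'mor', 'pro:dem|that~cop|be&3S neg|not n|paper .'),
    ('%', 'act', 'draws on the table'),
    ('*', 'FAT', 'wow !'),
    ('%', 'mor', 'co|wow !'),
]
blocks = group_into_blocks(sample)
result = select(blocks, speakers=['FAT'], tiers=['mor'])
print(result)
```

Because the selection is made per block, the `mor` line belonging to the child's utterance is left out when only the father is requested, which is exactly what plain filtering on the tier name could not do.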

## Seeing it in action - a few examples

```python
import os
os.chdir(r'C:\Users\Kasper Fyhn Jacobsen\python-projects\ChildLangAcqui\src')
import childes_transcripts as ts

transcript = ts.Transcript(r'C:\Users\Kasper Fyhn Jacobsen\python-projects\ChildLangAcqui\data\Kuczaj\020606.cha')

# spoken lines by all speakers
transcript.lines_as_tuples()
```

```
[('CHI', 'Momma (.) draw Momma (.) draw please draw .'),
 ('MOT', 'Abe (.) I have_to eat breakfast before I can draw .'),
 ('MOT', 'here (.) you draw .'),
 ('CHI', 'this is broken (.) Mom .'),
 ('MOT', "what's broke ?"),
 ('CHI', 'this is broke .'),
 ('MOT', 'draw with another one „ okay ?'),
 ('CHI', "that's not paper ."),
 ('MOT', "that isn't paper ."),
 ('CHI', "uhuh that's not paper in here ."),
 ('CHI', "Momma (.) I don't want this picture ."),
 ('MOT', 'how come ?'),
 ('CHI', 'huh .'),
 ('MOT', "how come you don't want that picture ?"),
 ('MOT', 'you drew it .'),
 ('CHI', "it's broken ."),
 ('MOT', "oh ‡ the paper's ripped ."),
 ('MOT', 'here (.) draw on this .'),
 ('CHI', 'this is a picture .'),
 ('CHI', 'this is not paper .'),
 ('MOT', "that's orange paper ."),
 ('CHI', "oh ‡ we didn't get orange paper (.) Mom ."),
 ('MOT', 'go ahead .'),
 ('MOT', 'you can draw on it .'),
 ('CHI', 'here (.) Mom see eyes .'),
 ('CHI', 'see the eyes .'),
 ('MOT', 'oh ‡ Abe made two red eyes (.) D
```

```python
# morphosyntactically annotated lines by the father
transcript.lines_as_tuples(speakers='FAT', morphosyntax=True)
```

```
[('FAT', 'wow !'),
 ('mor', 'co|wow !'),
 ('FAT', 'okay ‡ give it here .'),
 ('mor', 'co|okay beg|beg v|give pro:per|it adv|here .'),
 ('FAT', 'we might need it again though .'),
 ('mor', 'pro:sub|we mod|might v|need pro:per|it adv|again adv|though .'),
 ('FAT', 'is somebody going to climb it ?'),
 ('mor', 'aux|be&3S pro:indef|somebody part|go-PRESP inf|to v|climb'),
 ('FAT', "that's a neat ladder (.) Abe ."),
 ('mor', 'pro:dem|that~cop|be&3S det:art|a adj|neat n|ladder n:prop|Abe .'),
 ('FAT', "it's not for me ."),
 ('mor', 'pro:per|it~cop|be&3S neg|not prep|for pro:obj|me .'),
 ('FAT', "who's it for ?"),
 ('mor', 'pro:rel|who~aux|be&3S pro:per|it prep|for ?'),
 ('FAT', "oh ‡ it's beautiful (.) Abe ."),
 ('mor', 'co|oh beg|beg pro:per|it~cop|be&3S adj|beautiful n:prop|Abe .'),
 ('FAT', 'I like it .'),
 ('mor', 'pro:sub|I v|like pro:per|it .'),
 ('FAT', 'I like it .'),
 ('mor', 'pro:sub|I v|like pro:per|it .'),
 ('FAT', 'I do too .'),
 ('mor', 'pro:sub|I v|do post|too .'),
 ('FAT', 'uh
```

```python
# morphosyntax and actions annotated lines by the mother and the child (excluding the father)
transcript.lines_as_tuples(speakers=['MOT', 'CHI'], morphosyntax=True, actions=True)
```

```
[('CHI', 'Momma (.) draw Momma (.) draw please draw .'),
 ('mor', 'n:prop|Momma v|draw n:prop|Momma v|draw co|please v|draw .'),
 ('MOT', 'Abe (.) I have_to eat breakfast before I can draw .'),
 ('mor', 'n:prop|Abe pro:sub|I mod:aux|have_to v|eat n|breakfast conj|before'),
 ('MOT', 'here (.) you draw .'),
 ('mor', 'adv|here pro:per|you v|draw .'),
 ('CHI', 'this is broken (.) Mom .'),
 ('mor', 'pro:dem|this aux|be&3S part|break&PASTP n:prop|Mom .'),
 ('MOT', "what's broke ?"),
 ('mor', 'pro:int|what~aux|be&3S adj|broke ?'),
 ('CHI', 'this is broke .'),
 ('mor', 'pro:dem|this cop|be&3S adj|broke .'),
 ('MOT', 'draw with another one „ okay ?'),
 ('mor', 'v|draw prep|with qn|another pro:indef|one end|end co|okay ?'),
 ('CHI', "that's not paper ."),
 ('mor', 'pro:dem|that~cop|be&3S neg|not n|paper .'),
 ('act', 'draws on the table'),
 ('MOT', "that isn't paper ."),
 ('mor', 'pro:dem|that cop|be&3S~neg|not n|paper .'),
 ('CHI', "uhuh that's not paper in here ."),
 ('mor', 'co|uhuh pro:dem|t
```