In [1]:
import sys; sys.path.append(_dh[0].split("knowknow")[0])
from knowknow import *

In [2]:
showdocs("counter")

# Counting coocurrences

Cultural phenomena are rich in meaning and context. Moreover, the meaning and context are what we care about, so stripping that would be a disservice. "Consider Geertz:"
> Not only is the semantic structure of the figure a good deal more complex than it appears on the surface, but an analysis of that structure forces one into tracing a multiplicity of referential connections between it and social reality, so that the final picture is one of a configuration of dissimilar meanings out of whose interworking both the expressive power and the rhetorical force of the final symbol derive. (Geertz [1955] 1973, Chapter 8 Ideology as a Cultural System, p. 213)

The way people understanding their world shape their action, and understandings are heterogeneous in any community, woven into a complex web of interacting pieces and parts. Understandings are constantly evolving, shifting with every conversation or Breaking News. Any quantitative technique for studying meaning must be able to capture the relational structure of cultural objects, their temporal dynamics, or it cannot be meaning.

These considerations motivate how I have designed the data structure and code for this project. My attention to "cooccurrences" in what follows is an application of Levi Martin and Lee's (2018) formal approach to meaning. They develop the symbolic formalism I use below, as well as showing several general analytic strategies for inductive, ground-up meaning-making from count data. This approach is quite general, useful for many applications.

The process is rather simple, I count cooccurrences between various attributes. For each document, for each citation in that document, I increment a dozen counters, depending on attributes of the citation, paper, journal, or author. This counting process is done once, and can be used as a compressed form of the dataset for all further analyses. In the terminology of Levi Martin and Lee, I am constructing "hypergraphs", and I will use their notation in what follows. For example $[c*fy]$ indicates the dataset which maps from $(c, fy) \to count$.
$c$ is the name of the cited work. $fy$ is the publication year of the article which made the citation. $count$ is the number of citations which are at the intersection of these properties.

+ $[c]$ the number of citations each document receives
+ $[c*fj]$ the number of citations each document receives from each journal's articles
+ $[c*fy]$ the number of citations each document receives from each year's articles
+ $[fj]$ the number of citations from each journal
+ $[fj*fy]$ the number of citations in each journal in each year
+ $[t]$ cited term total counts
+ $[fy*t]$ cited term time series
+ term cooccurrence with citation and journal ($[c*t]$ and [fj*t]$)
+ "author" counts, the number of citations by each author ($[a]$ $[a*c]$ $[a*j*y]$)
+ [c*c]$, the cooccurrence network between citations
+ the death of citations can be studied using the $[c*fy]$ hypergraph
+ $[c*fj*t]$ could be used for analyzing differential associations of $c$ to $t$ across publication venues
+ $[ta*ta]$, $[fa*fa]$, $[t*t]$ and $[c*c]$ open the door to network-scientific methods



# References



+ Martin, John Levi, and Monica Lee. 2018. “A Formal Approach to Meaning.” Poetics 68(February):10–17.
+ Geertz, Clifford. 1973. The Interpretation of Cultures. New York: Basic Books, Inc.

# README

First, you need to get some data. In accordance with JSTOR's usage policies, I **do not provide any full-text data**. And that's the data you need to use this notebook.
You can obtain your own data by requesting full OCR data packages through JSTOR's [Data for Research](https://www.jstor.org/dfr/) initiative. 

Make sure to read carefully through "User Settings." Set the appropriate settings, and run the entire notebook.

This will create a new "database" of counts, which can be recalled by running `my_counts = get_cnt( '<DB_NAME_HERE>' )`.

# User Settings

`database_name` is the name you choose for the final dataset of counts

`zipdir` is the directory which contains the `.zip` files JSTOR provides to you (not included)

`mode` choose between "basic" and "all" mode

1. "basic" mode
    + this mode is not typically faster than `everything`, but it does reduce RAM overhead
        + on ~200k articles the running counters take up more than 16GB RAM
        + to counter this, I first run simple statistics, then rerun this notebook again, filtering based on the descriptive statistics
    + includes `c` counts, the number of citations each document receives
    + includes `c.fj` counts, the number of citations each document receives from each journal's articles
    + includes `c.fy` counts, the number of citations each document receives from each year's articles
    + includes `fj` counts, the number of citations from each journal
    + includes `fj.fy` counts, the number of citations in each journal in each year
    + includes `t` `fy.t` counts, for term time series and filtering

2. "all" mode
    + you must run this if you want to run all analyses included in this project
    + includes all counts from `basic` mode
    + includes term cooccurrence with citation and journal (`c.t` `fj.t`)
    + includes "author" counts, the number of citations by each author (`a` `a.c` `a.j.y`)
    + includes `c.c`, the cooccurrence network between citations

In [3]:
database_name = 'sociology-jstor-basicall'
zipdir = 'G:/My Drive/projects/qualitative analysis of literature/pre 5-12-2020/003 process JSTOR output/RaW dAtA/'
mode = 'all'

I use citation and journal filters while counting. 
This filtering is important when working with large datasets. You can run the "trend summaries/cysum" on a `basic` database, and use the variable it automatically generates, `"<DBNAME>.included_citations"` to modify which citations to use when computing the `all` database.

In most cases, it's best to set `use_included_citations_filter` and `use_included_journals_filter` both to `False` the first time you run this notebook on a new dataset.

In [4]:
use_included_citations_filter = True
use_included_journals_filter = True

# not necessary if you're not filtering based on citations and journals pre-count
included_citations = load_variable("sociology-jstor.included_citations")
included_journals = ['Acta Sociologica', 'Administrative Science Quarterly', 'American Journal of Political Science', 'American Journal of Sociology', 'American Sociological Review', 'Annual Review of Sociology', 'BMS: Bulletin of Sociological Methodology / Bulletin de Méthodologie Sociologique', 'Berkeley Journal of Sociology', 'Contemporary Sociology', 'European Sociological Review', 'Hitotsubashi Journal of Social Studies', 'Humboldt Journal of Social Relations', 'International Journal of Sociology', 'International Journal of Sociology of the Family', 'International Review of Modern Sociology', 'Journal for the Scientific Study of Religion', 'Journal of Health and Social Behavior', 'Journal of Marriage and Family', 'Language in Society', 'Michigan Sociological Review', 'Polish Sociological Review', 'Review of Religious Research', 'Social Forces', 'Social Indicators Research', 'Social Problems', 'Social Psychology Quarterly', 'Sociological Bulletin', 'Sociological Focus', 'Sociological Forum', 'Sociological Methodology', 'Sociological Perspectives', 'Sociological Theory', 'Sociology', 'Sociology of Education', 'Sociology of Religion', 'Symbolic Interaction', 'The American Sociologist', 'The British Journal of Sociology', 'The Canadian Journal of Sociology', 'The Sociological Quarterly', 'Theory and Society']

Terms are iteratively pruned. After `CONSOLIDATE_EVERY_N_CITS` citations are counted, the algorithm will keep only the top `NUM_TERMS_TO_KEEP` terms, blacklisting the rest and not counting them anymore. This doesn't hurt the dataset, but dramatically reduces the RAM overhead and the size of the final dataset on disk.

In [5]:
CONSOLIDATE_TERMS = True

NUM_TERMS_TO_KEEP = 5000

CONSOLIDATE_EVERY_N_CITS = NUM_TERMS_TO_KEEP*3
#CONSOLIDATE_EVERY_N_CITS = 1000

NPERYEAR = 300

It's also convenient to be able to rename various entities. There were a few different names for the Canadian Journal of Sociology. If you want to filter on something other than journals, you'll have to modify the code and add this feature.

In [6]:
journal_map = {} # default
journal_map = {
    "Canadian Journal of Sociology / Cahiers canadiens de sociologie": 'The Canadian Journal of Sociology',
    "The Canadian Journal of Sociology / Cahiers canadiens de\n                sociologie": 'The Canadian Journal of Sociology',
    'The Canadian Journal of Sociology / Cahiers canadiens de sociologie': 'The Canadian Journal of Sociology'
}

# imports

In [7]:
# utilities
from nltk import sent_tokenize
from zipfile import ZipFile

import os
import sys
sys.path.insert(0, os.path.abspath('./creating variables/'))

# library functions for cleaning and extracting in-text citations from OCR
from cnt_cooc_jstor_lib import (
    citation_iterator, getOuterParens, 
    Document, ParseError, 
    clean_metadata
)


In [8]:
# getting ready for term counting
from nltk.corpus import stopwords as sw
stopwords = set(sw.words('english'))

In [9]:
zipfiles = list(Path(zipdir).glob("*.zip"))

# helpers

The following helper function `file_iterator` iterates through all documents inside a list of zipfiles

Each iteration returns:
1. the document DOI
2. the metadata file contents
3. the ocr file contents

`get_page_strings` takes the string contents of an XML file produced by JSTOR. The XML file in question represents the text of a given article. This function cleans the text for OCR peculiarities, and splits the document into pages for further processing.

`consolidate terms` was built to eliminate all terms which are not in the top `NUM_TERMS_TO_KEEP`.
This is done by sorting `fromyear-term`, or `fy.t` counts in descending order. The top entry here is the term-year pair which accumulated the most appearances in citation contexts. I take the top 1000 `t`'s in this sorted list and preserve them, and blacklist the rest.

In [12]:
term_whitelist = set()

def consolidate_terms():
    global term_whitelist, CONSOLIDATION_CUTOFF
    

    have_now = set(cnt_doc['t'])
    # this is where the filtering occurs
    
    to_keep = set()
    if True:
        
        # takes terms based on the maximum number I can take...
        terms = list(cnt_doc['t'].keys())
        counts = np.array([cnt_doc['t'][k] for k in terms])
        argst = list(reversed(np.argsort(counts)))
        
        to_keep = [terms[i] for i in argst if '-' in terms[i][0]][:NUM_TERMS_TO_KEEP//2] # half should be 2-tuples
        to_keep += [terms[i] for i in argst if not '-' in terms[i][0]][:NUM_TERMS_TO_KEEP//2] # half should be 1-tuples
        
        to_remove = have_now.difference(to_keep)
        to_remove = set("-".join(x) for x in to_remove)
            
    
    if False:
        # takes the top 5000 terms in terms of yearly count
        sort_them = sorted(cnt_doc['fy.t'], key=lambda x: -cnt_doc['fy.t'][x])
        to_keep = defaultdict(set)
        
        i = 0
        while not len(to_keep) or (
            min(len(x) for x in to_keep.values()) < NPERYEAR and 
            i < len(sort_them)
        ):
            # adds the term to the year set, if it's not already "full"
            me = sort_them[i]
            me_fy, me_t = me
            
            # eventually, we don't count it :P
            if cnt_doc['t'][me_t] < CONSOLIDATION_CUTOFF:
                break
            
            if len(to_keep[me_fy]) < NPERYEAR:
                to_keep[me_fy].add(me_t) 
            i += 1
            
        if False: # useful for debugging
            print({
                k: len(v)
                for k,v in to_keep.items()
            })
            
        to_keep = set(chain.from_iterable(x for x in to_keep.values()))
        to_remove = have_now.difference(to_keep)
    
    
    # so that we never log counts for these again:
    term_whitelist.update([x[0] for x in to_keep])

    # the rest of the code is pruning all other term counts for this term in memory
    print("consolidating... removing", len(to_remove), 'e.g.', sample(to_remove,5))
    
    to_prune = ['t','fy.t','fj.t','c.t']
    for tp in to_prune:
        
        whichT = tp.split(".").index('t') # this checks where 't' is in the name of the variable (first or second?)

        print("pruning '%s'..." % tp)

        tydels = [x for x in cnt_doc[tp] if x[ whichT ] in to_remove]
            
        print("old size:", len(cnt_doc[tp]))
        for tr in tydels:
            del cnt_doc[tp][tr]
            del cnt_ind[tp][tr]
        print("new size:", len(cnt_doc[tp]))
        
    
    print("final terms: ", ", ".join( sample(list("-".join(list(x)) for x in cnt_doc['t']), 200) ))

# Counting algorithm

The following cells contain the counting function, which accounts for a document in various ways.
This function should be relatively simple to extend, if you want to count other combinations, or different attributes altogether.

In [13]:
cnt_ind = defaultdict(lambda:defaultdict(int))
track_doc = defaultdict(lambda:defaultdict(set))
cnt_doc = defaultdict(lambda:defaultdict(int))

def cnt(term, space, doc):
    # it's a set, yo
    track_doc[space][term].add(doc)
    # update cnt_doc
    cnt_doc[space][term] = len(track_doc[space][term])
    # update ind count
    cnt_ind[space][term] += 1

In [14]:
cits = 0
last_print = 0
citations_skipped = 0

def account_for(doc):
    global cits, last_print, mode, citations_skipped
    
    # consolidating "terms" counter as I go, to limit RAM overhead
    # I'm only interested in the most common 1000
    if CONSOLIDATE_TERMS and \
            not len(term_whitelist) and \
            cits - last_print > CONSOLIDATE_EVERY_N_CITS:
        print("Citation %s" % cits)
        print("Term %s" % len(cnt_doc['t']))
        #print(sample(list(cnt_doc['t']), 10))
        last_print = cits
        consolidate_terms()


    if 'citations' not in doc or not len(doc['citations']):
        #print("No citations", doc['doi'])
        return

    for c in doc['citations']:
        if 'contextPure' not in c:
            raise Exception("no contextPure...")



        for cited in c['citations']:
            
            if use_included_citations_filter and (cited not in included_citations):
                citations_skipped += 1
                continue
            
            cits += 1
            cnt(doc['year'], 'fy', doc['doi'])

            # citation
            cnt(cited, 'c', doc['doi'])

            # journal
            cnt(doc['journal'], 'fj', doc['doi'])

            # journal year
            cnt((doc['journal'], doc['year']), 'fj.fy', doc['doi'])

            # citation journal
            cnt((cited, doc['journal']), 'c.fj', doc['doi'])

            # citation year
            cnt((cited, doc['year']), 'c.fy', doc['doi'])

            
        # constructing the tuples set :)
        sp = c['contextPure'].lower()
        sp = re.sub("[^a-zA-Z\s]+", "", sp) # removing extraneous characters
        sp = re.sub("\s+", " ", sp) # removing extra characters
        sp = sp.strip()
        sp = sp.split() # splitting into words
        
        sp = [x for x in sp if x not in stopwords] # strip stopwords
        
        if False:
            tups = set(zip(sp[:-1], sp[1:])) # two-word tuples
        elif False:
            tups = set( (t1,t2) for t1 in sp for t2 in sp if t1!=t2 )# every two-word pair :)
        else:
            
            tups = set( "-".join(sorted(x)) for x in set(zip(sp[:-1], sp[1:]))) # two-word tuples
            tups.update( sp ) # one-word tuples
            
        #print(len(tups),c['contextPure'], "---", tups)
        
        if len(term_whitelist):
            tups = [x for x in tups if x in term_whitelist]

        # just term count, in case we are using the `basic` mode
        for t1 in tups:
            # term
            cnt((t1,), 't', doc['doi'])

            # term year
            cnt((doc['year'], t1), 'fy.t', doc['doi'])
            
        
        if mode == 'all':


            for cited in c['citations']:
                
                if use_included_citations_filter and (cited not in included_citations):
                    continue
                    
                # term features
                for t1 in tups:
                    
                    # cited work, tuple
                    cnt((cited, t1), 'c.t', doc['doi'])

                    # term journal
                    cnt((doc['journal'], t1), 'fj.t', doc['doi'])

                    if False: # eliminating data I'm not using

                        # author loop
                        for a in doc['authors']:
                            # term author
                            cnt((a, t1), 'fa.t', doc['doi'])
                            
                    if len(term_whitelist): # really don't want to do this too early. wait until it's narrowed down to the 5k
                        # term term...
                        for t2 in tups:
                            # if they intersect each other, continue...
                            if len(set(t1).intersection(set(t2))) >= min(len(t1),len(t2)):
                                continue

                            # term term
                            cnt((t1,t2), 't.t', doc['doi'])

                # author loop
                for a in doc['authors']:
                    # citation author
                    cnt((cited,a), 'c.fa', doc['doi'])

                    # year author journal
                    cnt((a, doc['journal'], doc['year']), 'fa.fj.fy', doc['doi'])

                    # author
                    cnt((a,), 'fa', doc['doi'])

                # add to counters for citation-citation counts
                for cited1 in c['citations']:
                    for cited2 in c['citations']:
                        if cited1 >= cited2:
                            continue

                        cnt(( cited1, cited2 ), 'c.c', doc['doi'])
                        cnt(( cited1, cited2, doc['year'] ), 'c.c.fy', doc['doi'])


# Master counting cell

This cell is **long-running**

In [None]:
def getname(x):
    x = x.split("/")[-1]
    x = re.sub(r'(\.xml|\.txt)','',x)
    return x

def jstor_file_iterator_1(zipfiles):
    from random import shuffle
    
    all_files = []
    for zf in zipfiles:
        archive = ZipFile(zf, 'r')
        files = archive.namelist()
        names = list(set(getname(x) for x in files))
        
        all_files += [(archive,name) for name in names]
        
    shuffle(all_files)
        
    for archive, name in all_files:
        try:
            yield(
                name.split("-")[-1].replace("_", "/"),
                archive.read("metadata/%s.xml" % name),
                archive.read("ocr/%s.txt" % name).decode('utf8')
            )
        except KeyError: # some very few articles don't have both
            continue
            
def jstor_file_iterator(zipfiles):
    for i, (doi, metadata_str, ocr_str) in enumerate(jstor_file_iterator_1(zipfiles)):
        try:
            drep = clean_metadata( doi, metadata_str )


            # only include journals in the list "included_journals"
            if use_included_journals_filter and (drep['journal'] not in included_journals):
                continue

            if debug: print("got meta")

            if drep['type'] != 'research-article':
                continue

            # some types of titles should be immediately ignored
            def title_looks_researchy(lt):
                lt = lt.lower()
                lt = lt.strip()

                for x in ["book review", 'review essay', 'back matter', 'front matter', 'notes for contributors', 'publication received', 'errata:', 'erratum:']:
                    if x in lt:
                        return False

                for x in ["commentary and debate", 'erratum', '']:
                    if x == lt:
                        return False

                return True

            lt = drep['title'].lower()
            if not title_looks_researchy(lt):
                continue

            # Don't process the document if there are no authors
            if not len(drep['authors']):
                continue

            drep['content'] = get_content_string(ocr_str)

            drep['citations'] = []

            # loop through the matching parentheses in the document
            for index, (parenStart, parenContents) in enumerate(getOuterParens(drep['content'])):

                citations = list(citation_iterator(parenContents))
                if not len(citations):
                    continue


                citation = {
                    "citations": citations,
                    "contextLeft": drep['content'][parenStart-400+1:parenStart+1],
                    "contextRight": drep['content'][parenStart + len(parenContents) + 1:parenStart + len(parenContents) + 1 + 100],
                    "where": parenStart
                }


                # cut off any stuff before the first space
                first_break_left = re.search(r"[\s\.!\?]+", citation['contextLeft'])
                if first_break_left is not None:
                    clean_start_left = citation['contextLeft'][first_break_left.end():]
                else:
                    clean_start_left = citation['contextLeft']

                # cut off any stuff after the last space
                last_break_right = list(re.finditer(r"[\s\.!\?]+", citation['contextRight']))
                if len(last_break_right):
                    clean_end_right = citation['contextRight'][:last_break_right[-1].start()]
                else:
                    clean_end_right = citation['contextRight']

                # we don't want anything more than a sentence

                sentence_left = sent_tokenize(clean_start_left)
                if len(sentence_left):
                    sentence_left = sentence_left[-1]
                else:
                    sentence_left = ""

                sentence_right = sent_tokenize(clean_end_right)[0]
                if len(sentence_right):
                    sentence_right = sentence_right[0]
                else:
                    sentence_right = ""

                # finally, strip the parentheses from the string
                sentence_left = sentence_left[:-1]
                sentence_right = sentence_right[1:]

                # add the thing in context
                full = sentence_left + "<CITATION>" + sentence_right

                citation['contextPure'] = sentence_left
                #print(full)

                drep['citations'].append(citation)
            
            yield doi, drep
            
        except ParseError as e:
            print("parse error...", e.args, doi)

In [15]:
seen = set()

skipped = 0

total_count = Counter()
doc_count = Counter()
pair_count = Counter()

debug = False


for i, (doi, drep) in enumerate( jstor_file_iterator(zipfiles) ):

    if i % 1000 == 0:
        print("Document", i, "...", 
              len(cnt_doc['fj'].keys()), "journals...", 
              len(cnt_doc['c'].keys()), "cited works...", 
              len(cnt_doc['fa'].keys()), "authors...",
              len(cnt_doc['t'].keys()), "terms used...",
              citations_skipped, "skipped citations...",
              cnt_doc['t'][('social',)], "'social' terms"
             )

    # sometimes multiple journal names map onto the same journal, for all intents and purposes
    if drep['journal'] in journal_map:
        drep['journal'] = journal_map[drep['journal']]  

    # only include journals in the list "included_journals"
    if use_included_journals_filter and (drep['journal'] not in included_journals):
        continue
        
        
    # now that we have all the information we need,
    # we simply need to "count" this document in a few different ways
    account_for(drep)

Document 0 ... 0 journals... 0 cited works... 0 authors... 0 terms used... 0 skipped citations... 0 'social' terms
Document 1000 ... 37 journals... 4005 cited works... 383 authors... 64272 terms used... 2088 skipped citations... 132 'social' terms
Document 2000 ... 40 journals... 8208 cited works... 754 authors... 127737 terms used... 4635 skipped citations... 274 'social' terms
Citation 15075
Term 138929
consolidating... removing 133929 e.g. ['benefit-value', 'conditions-members', 'linked-violence', 'elaborate-independent', 'decoupled']
pruning 't'...
old size: 138929
new size: 5000
pruning 'fy.t'...
old size: 210650
new size: 56089
pruning 'fj.t'...
old size: 159555
new size: 39169
pruning 'c.t'...
old size: 371000
new size: 157187
final terms:  courts, measures-two, normal, however, number-total, intergenerational-mobility, perspectives-sociological, acceptable, minimum, conclude, actually, corporate, cases-many, data-present, time-work, political-religious, phenomenon, unequal, sim

In [16]:
list(cnt_doc['t'])[:5]

[('social',), ('played',), ('ever',), ('candidates',), ('money',)]

In [17]:
len([x for x in cnt_doc['t'] if not '-' in x[0]])

2500

In [18]:
min(list(cnt_doc['t'].values()))

4

In [19]:
for k,v in cnt_doc.items():
    print(k, len(v))

fj 41
c 141608
fa 37329
t 5000
fy 104
fj.fy 1849
c.fj 463963
c.fy 588668
fy.t 238387
c.t 9661193
fj.t 164358
c.fa 1328474
fa.fj.fy 68269
c.c 1019555
c.c.fy 1122173
t.t 10174838


# Save the database

In [20]:
save_cnt("%s.doc"%database_name, cnt_doc)

Saving sociology-jstor-basicall.doc ___ fj
Saving sociology-jstor-basicall.doc ___ c
Saving sociology-jstor-basicall.doc ___ fa
Saving sociology-jstor-basicall.doc ___ t
Saving sociology-jstor-basicall.doc ___ fy
Saving sociology-jstor-basicall.doc ___ fj.fy
Saving sociology-jstor-basicall.doc ___ c.fj
Saving sociology-jstor-basicall.doc ___ c.fy
Saving sociology-jstor-basicall.doc ___ fy.t
Saving sociology-jstor-basicall.doc ___ c.t
Saving sociology-jstor-basicall.doc ___ fj.t
Saving sociology-jstor-basicall.doc ___ c.fa
Saving sociology-jstor-basicall.doc ___ fa.fj.fy
Saving sociology-jstor-basicall.doc ___ c.c
Saving sociology-jstor-basicall.doc ___ c.c.fy
Saving sociology-jstor-basicall.doc ___ t.t


In [21]:
save_cnt("%s.ind"%database_name, cnt_ind)

Saving sociology-jstor-basicall.ind ___ fy
Saving sociology-jstor-basicall.ind ___ c
Saving sociology-jstor-basicall.ind ___ fj
Saving sociology-jstor-basicall.ind ___ fj.fy
Saving sociology-jstor-basicall.ind ___ c.fj
Saving sociology-jstor-basicall.ind ___ c.fy
Saving sociology-jstor-basicall.ind ___ t
Saving sociology-jstor-basicall.ind ___ fy.t
Saving sociology-jstor-basicall.ind ___ c.t
Saving sociology-jstor-basicall.ind ___ fj.t
Saving sociology-jstor-basicall.ind ___ c.fa
Saving sociology-jstor-basicall.ind ___ fa.fj.fy
Saving sociology-jstor-basicall.ind ___ fa
Saving sociology-jstor-basicall.ind ___ c.c
Saving sociology-jstor-basicall.ind ___ c.c.fy
Saving sociology-jstor-basicall.ind ___ t.t


In [2]:
comments()