# Homework 10: Cleaning data with Regular Expressions

Time to use regular expressions!

# Hints and notes

### Opening files in subdirectories

Notice that this notebook might be **homework/**, but!!! the csvs and text files might be in **homework/scraped/** or **/homework/scraped/minutes_pdfs** or **/homework/pdfs/**. To open a file in a subdirectory, instead of having the filename be `"file.csv"` you'll just use `"some/subfolder/file.csv"`
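Here's a minimal sketch of opening a file in a subfolder. It builds a throwaway folder first so it runs anywhere; the `scraped` and `file.csv` names are stand-ins, not the real homework files.

```python
import os
import tempfile

# Make a throwaway "homework" folder with a "scraped" subfolder inside it.
# (Hypothetical names, only for illustration.)
homework = tempfile.mkdtemp()
os.mkdir(os.path.join(homework, "scraped"))
with open(os.path.join(homework, "scraped", "file.csv"), "w") as f:
    f.write("name,price\nBowl,$17.99\n")

# From inside the homework folder the filename would just be "scraped/file.csv":
# subfolder, slash, filename.
filename = homework + "/scraped/file.csv"
contents = open(filename).read()
print(contents)
```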

### Opening text files

This will open up a file, read it in and show you the first 500 characters.

```python
contents = open("your-filename.txt").read()
contents[0:500]
```

> You might need `open("your-filename.txt", encoding="utf8").read()`

### Using regex

For some dumb reason you need to put `r` in front of the string you use when you're talking about regex. Just plain `"(\d\d\d)"` will usually work, but *sometimes* it won't and you'll need `r"(\d\d\d)"`. It's best to just use the `r` all of the time, if you can remember!
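A quick sketch of one case where the `r` matters, using `\b` (the sample text is made up):

```python
import re

# In a normal string, "\b" is ONE character (a backspace).
# In a raw string, r"\b" is TWO characters: a backslash and the letter b.
print(len("\b"), len(r"\b"))

text = "cat concat cat"
plain_matches = re.findall("\bcat\b", text)  # hunts for backspace characters, finds nothing
raw_matches = re.findall(r"\bcat\b", text)   # word boundaries, finds the two standalone cats
print(plain_matches, raw_matches)
```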

### Using `.str.extract`

When you use `.str.extract`, you're always going to **capture one thing** and save it to a new column. You need to wrap the things you're interested in with parentheses `(` `)`.

```python
df['phone_number'] = df['old_column'].str.extract(r"My phone number is (\d\d\d-\d\d\d-\d\d\d\d)")
```
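To see what that does on real rows, here's a small self-contained sketch with a made-up dataframe. The `expand=False` part is my addition: it asks pandas for a single column back rather than a one-column dataframe.

```python
import pandas as pd

df = pd.DataFrame({"old_column": [
    "My phone number is 555-123-4567",
    "I don't have a phone",
]})

# Only the part inside ( ) is kept. Rows that don't match become NaN.
df["phone_number"] = df["old_column"].str.extract(
    r"My phone number is (\d\d\d-\d\d\d-\d\d\d\d)", expand=False
)
print(df["phone_number"])
```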

### Setting pandas options

Pandas has a lot of options, like how many columns or rows it will show you, or how many characters it will show in a column before it stops showing you anything. Here are a few useful ones:

* `display.max_columns`: Number of columns to show at once
* `display.max_rows`: Number of rows to show at once
* `display.max_colwidth`: Maximum number of characters displayed from a string

You can set them using `pd.set_option("display.max_rows", 1000)`, for example, to show 1000 rows at a time. You can find a lot more at https://pandas.pydata.org/pandas-docs/stable/generated/pandas.set_option.html
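A short sketch of setting, checking and undoing an option. `pd.get_option`, `pd.option_context` and `pd.reset_option` aren't mentioned above, but they're part of the same pandas options system.

```python
import pandas as pd

pd.set_option("display.max_rows", 1000)
rows_after_set = pd.get_option("display.max_rows")
print(rows_after_set)

# option_context changes a setting only inside the with-block
with pd.option_context("display.max_colwidth", 20):
    width_inside = pd.get_option("display.max_colwidth")
print(width_inside)

# Put display.max_rows back to whatever the default is
pd.reset_option("display.max_rows")
```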

### Regular expressions reference

I personally think http://www.regular-expressions.info/ is a wonderful wonderful reference (and tutorial), even if it's ugly! But here's a quick reference for you:

* `\d` is a digit
* `\d*` is zero or more digits 
* `\d+` is one or more digits
* `.` matches anything for ONE character
* `.*` is "give me anything forever"
* `\s` is whitespace, a.k.a. spaces, tabs and newlines
* `\w` is a word character, which includes capital and lowercase letters, numbers and underscores (but not hyphens).
* You can put `*` after anything, so `\w*` would mean "as many word characters as you can find"
* `\b` is a word boundary (you'll need the `r""` thing for this one)
* `( )` is a "capture group" for saving something
* `\1` is used when doing find/replace to say "put the first captured group here" (note, it's a dollar sign instead of a backslash in some editors)
* `[ABCDE]` is a character class, which means "match one of these, I don't care which"
* dollar sign means "end of the line"
* caret ^ means "beginning of the line"
* `\.` means "no really seriously I mean a period not just anything"
* You can use `\` with anything else that would normally be a special character, too, not just periods: `\(` or `\[` or whatever.
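Here are a few items from the list in action, as a rough sketch using plain Python `re` on a made-up sentence:

```python
import re

line = "Order #42: 3 mugs at $4.50 each."

digits = re.findall(r"\d+", line)                    # every run of one or more digits
price = re.search(r"\$(\d+\.\d\d)", line).group(1)   # escaped $ and ., capture group
first_word = re.search(r"^\w+", line).group(0)       # ^ anchors to the beginning
ends_with_period = bool(re.search(r"\.$", line))     # $ anchors to the end

# \1 and \2 put captured groups back in during find/replace
flipped = re.sub(r"(\w+), (\w+)", r"\2 \1", "MENDEZ, Francisco")

# A character class matches ONE character out of the set
letters = re.findall(r"[ABCDE]", "13472-N 35079-D 72462-A")

print(digits, price, first_word, ends_with_period, flipped, letters)
```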

### Cleaning up extracted columns

Sometimes you get `\n` (newlines) or spaces or `\t` (tabs) or stuff at the beginning or the end of your column. `.str.strip()` will usually take care of that, just attach it after your `.str.extract()`

After you extract something, it's still a string even though you look at it and know it's a number. Use `.astype(int)` to turn it into an integer (no decimal) or `.astype(float)` to turn it into a float (yes decimal)
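Putting those two steps together, as a minimal sketch on a made-up column (the `raw` and `score` names are just for illustration):

```python
import pandas as pd

df = pd.DataFrame({"raw": ["\n score: 292 \n", "\t score: 185\n"]})

# (.*) grabs everything after "score:" up to the end of that line,
# which includes the stray spaces around the number
extracted = df["raw"].str.extract(r"score:(.*)", expand=False)

# strip() removes the whitespace, astype(int) turns the string into a number
df["score"] = extracted.str.strip().astype(int)
print(df["score"])
```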

### Writing regular expressions in general

Even if I'm using regex in pandas or Python, I like to test them in my text editor with "Find." The highlighting really helps me see if I'm matching things! I also like to think "what stays the same?" when designing patterns, write those parts first, then fill in the blanks with what I want to capture.

## Importing

There might be more, I just wanted to put this up here for the `pd.set_option` part. It allows you to see a lot of content in a single column of pandas, which will be important for some parts below.

In [1]:
import re
import pandas as pd
pd.set_option('display.max_colwidth', 500)

# Part 1: Using `.str.extract` to pull data from columns in pandas

## 1.1 H&M

Open up `hm.csv` from the `scraped` directory. I want **four new columns**:

1. `price_original`, the original price
2. `price_discounted`, the discounted price
3. `pct_discount`, the percent discount
4. `article_id`, the article id (from the url)

Save as **hm_cleaned.csv**.

**Note:** When you look at it, it... won't look right. I don't know why, pandas is weird. Look at the `price` column by itself using `df['price']` before you write your regex.

**Tip:** Remember that `$` is a special regex symbol! You might need to escape it.

**Tip:** When doing `.str.extract`, the whole match doesn't get captured, only what you put `()` around! Think about anchoring to different points of the string, or things in the string.

**Tip:** Not all prices have cents!

**Tip:** Your first instinct about how to compute the percent discount is probably wrong

In [2]:
df = pd.read_csv("scraped/hm.csv")



In [3]:
df.head()

Unnamed: 0,name,price,url
0,Washed Linen Duvet Cover Set,$59.99 $129,http://www.hm.com/us/product/13472?article=13472-N
1,Candle in Glass Jar,$6.99 $17.99,http://www.hm.com/us/product/35079?article=35079-D
2,Glittery Cushion Cover,$7.99 $17.99,http://www.hm.com/us/product/72462?article=72462-A
3,Textured-weave Cushion Cover,$6.99 $12.99,http://www.hm.com/us/product/58926?article=58926-C
4,Stoneware Bowl,$17.99 $24.99,http://www.hm.com/us/product/74242?article=74242-A


In [4]:
#extracting article id
df['article_id'] = df.url.str.extract(r"(\d\d\d\d\d-\w)")
df['article_id']

0     13472-N
1     35079-D
2     72462-A
3     58926-C
4     74242-A
5     70965-D
6     62818-B
7     69163-B
8     40910-C
9     69699-B
10    76916-A
11    68473-C
12    74498-A
13    41512-E
14    76727-A
15    76977-A
16    73369-A
17    59417-D
18    76797-A
19    76794-A
20    80432-A
21    78352-A
22    70658-A
23    72111-A
24    77665-A
25    70603-A
26    72262-B
27    70427-A
28    72262-A
29    77665-B
30    73380-A
31    35079-C
32    74773-A
33    75033-A
34    60637-C
35    70306-A
36    60621-B
37    69643-A
38    72455-B
39    72460-B
40    72462-B
41    71871-A
42    73734-A
43    74267-A
44    73344-A
45    73325-C
46    74264-A
47    72455-A
48    70799-B
49    68473-D
50    72111-C
51    58926-B
52    58211-A
53    75469-A
54    76982-A
55    74825-A
56    74263-A
57    78047-A
58    60637-G
59    78381-A
Name: article_id, dtype: object

In [5]:
df['price'].head()

0      $59.99 $129
1     $6.99 $17.99
2     $7.99 $17.99
3     $6.99 $12.99
4    $17.99 $24.99
Name: price, dtype: object

In [6]:
#extracting original price
df['price_original'] = df.price.str.extract(r" \$(\d\d?\.?\d?\d?)")
df['price_original'] = pd.to_numeric(df['price_original'])
df['price_original'].dtype

dtype('float64')

In [7]:
#extracting discounted price
df['price_discounted'] = df.price.str.extract(r"\$(\d\d?\.?\d?\d?)")
df['price_discounted'] = pd.to_numeric(df['price_discounted'])
df['price_discounted'].dtype

dtype('float64')

In [8]:
#calculating the discount percentage
df['pct_discount'] = round((df['price_original'] - df['price_discounted']) / df['price_original'] * 100) 
df['pct_discount'].head()

0    53.0
1    61.0
2    56.0
3    46.0
4    28.0
Name: pct_discount, dtype: float64
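You can check the arithmetic by hand for the first two rows. This is only a sketch: `pct_discount` is a helper name I made up, and the "wrong" version is my guess at the kind of slip the tip is hinting at.

```python
def pct_discount(original, discounted):
    """Percent taken off, measured against the ORIGINAL price."""
    return (original - discounted) / original * 100

duvet = round(pct_discount(129, 59.99))     # $59.99, was $129
candle = round(pct_discount(17.99, 6.99))   # $6.99, was $17.99
print(duvet, candle)

# Dividing by the discounted price instead gives a very different number
wrong = (129 - 59.99) / 59.99 * 100
print(wrong)
```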

In [9]:
df.head()

Unnamed: 0,name,price,url,article_id,price_original,price_discounted,pct_discount
0,Washed Linen Duvet Cover Set,$59.99 $129,http://www.hm.com/us/product/13472?article=13472-N,13472-N,129.0,59.99,53.0
1,Candle in Glass Jar,$6.99 $17.99,http://www.hm.com/us/product/35079?article=35079-D,35079-D,17.99,6.99,61.0
2,Glittery Cushion Cover,$7.99 $17.99,http://www.hm.com/us/product/72462?article=72462-A,72462-A,17.99,7.99,56.0
3,Textured-weave Cushion Cover,$6.99 $12.99,http://www.hm.com/us/product/58926?article=58926-C,58926-C,12.99,6.99,46.0
4,Stoneware Bowl,$17.99 $24.99,http://www.hm.com/us/product/74242?article=74242-A,74242-A,24.99,17.99,28.0


In [10]:
#dropping the old price column
df.drop('price', axis=1, inplace=True)


In [11]:
df.head()

Unnamed: 0,name,url,article_id,price_original,price_discounted,pct_discount
0,Washed Linen Duvet Cover Set,http://www.hm.com/us/product/13472?article=13472-N,13472-N,129.0,59.99,53.0
1,Candle in Glass Jar,http://www.hm.com/us/product/35079?article=35079-D,35079-D,17.99,6.99,61.0
2,Glittery Cushion Cover,http://www.hm.com/us/product/72462?article=72462-A,72462-A,17.99,7.99,56.0
3,Textured-weave Cushion Cover,http://www.hm.com/us/product/58926?article=58926-C,58926-C,12.99,6.99,46.0
4,Stoneware Bowl,http://www.hm.com/us/product/74242?article=74242-A,74242-A,24.99,17.99,28.0


In [12]:
#saving as csv
df.to_csv('hm_cleaned.csv', index=False)

## 1.2 Sci-Fi Authors

Open up `sci-fi.csv` to clean. Get rid of the `\n` on the title and give me six new columns:

* `avg_rating`
* `rating_count`
* `total_score`
* `score_votes`
* `series` the series the book belongs to
* `series_no` the book in the series that it is

For series, I'm talking about e.g. `(The Hunger Games, #1)` is `series` "The Hunger Games" and `series_no` 1.

Save as **sci-fi_cleaned.csv**.

**Tip:** You don't need regex to clean the title - there's a special thing that removes whitespace from the beginning/end of strings

**Tip:** Remember that `(` and `)` are special characters

**BONUS:** When you make the `total_score` column, pay close attention to it. If you notice the problem, fix it.

**BONUS:** You don't need these columns to be numbers, but life would be better if they were. 

In [13]:
df2 = pd.read_csv("scraped/sci-fi.csv")

In [14]:
df2

Unnamed: 0,full_rating,full_score,rank,title,url
0,"4.07 avg rating — 785,502 ratings","\nscore: 28,539,\n and\n292 people voted\n \n \n",1,\nThe Handmaid's Tale\n,/book/show/38447.The_Handmaid_s_Tale
1,"4.34 avg rating — 5,212,935 ratings","\nscore: 27,566,\n and\n282 people voted\n \n \n",2,"\nThe Hunger Games (The Hunger Games, #1)\n",/book/show/2767052-the-hunger-games
2,"3.76 avg rating — 922,308 ratings","\nscore: 20,049,\n and\n205 people voted\n \n \n",3,"\nFrankenstein, or The Modern Prometheus\n",/book/show/18490.Frankenstein_or_The_Modern_Prometheus
3,"4.04 avg rating — 702,272 ratings","\nscore: 17,684,\n and\n185 people voted\n \n \n",4,"\nA Wrinkle in Time (A Wrinkle in Time Quintet, #1)\n",/book/show/18131.A_Wrinkle_in_Time
4,"4.06 avg rating — 77,664 ratings","\nscore: 16,070,\n and\n165 people voted\n \n \n",5,\nThe Left Hand of Darkness\n,/book/show/18423.The_Left_Hand_of_Darkness
5,"4.23 avg rating — 2,345,974 ratings","\nscore: 12,935,\n and\n134 people voted\n \n \n",6,"\nDivergent (Divergent, #1)\n",/book/show/13335037-divergent
6,"4.30 avg rating — 2,049,239 ratings","\nscore: 12,261,\n and\n128 people voted\n \n \n",7,"\nCatching Fire (The Hunger Games, #2)\n",/book/show/6148028-catching-fire
7,"4.12 avg rating — 1,379,452 ratings","\nscore: 11,238,\n and\n117 people voted\n \n \n",8,"\nThe Giver (The Giver, #1)\n",/book/show/3636.The_Giver
8,"4.19 avg rating — 57,605 ratings","\nscore: 10,246,\n and\n107 people voted\n \n \n",9,\nThe Dispossessed\n,/book/show/13651.The_Dispossessed
9,"4.20 avg rating — 53,473 ratings","\nscore: 9,907,\n and\n104 people voted\n \n \n",10,\nKindred\n,/book/show/60931.Kindred


In [15]:
#extracting series number
df2['series_no'] = df2.title.str.extract(r", (#\d)\)")
df2['series_no'].head()

0    NaN
1     #1
2    NaN
3     #1
4    NaN
Name: series_no, dtype: object

In [16]:
#extracting series name
df2['series'] = df2.title.str.extract(r"\((.*), ")
df2['series'].head()

0                          NaN
1             The Hunger Games
2                          NaN
3    A Wrinkle in Time Quintet
4                          NaN
Name: series, dtype: object
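An alternative sketch, not what the cells above do: one pattern with two capture groups pulls out both pieces at once. Leaving the `#` outside the parentheses means `series_no` can become a real number. The titles are copied from the data.

```python
import pandas as pd

titles = pd.Series([
    "The Handmaid's Tale",
    "The Hunger Games (The Hunger Games, #1)",
    "Catching Fire (The Hunger Games, #2)",
])

# Two capture groups give back a dataframe with two columns.
# \d+ allows for book numbers with more than one digit.
parts = titles.str.extract(r"\((.*), #(\d+)\)")
parts.columns = ["series", "series_no"]
parts["series_no"] = pd.to_numeric(parts["series_no"])
print(parts)
```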

In [17]:
#extracting score votes
df2['score_votes'] = df2.full_score.str.extract(r"(\d?\d?\d?) p")
df2['score_votes'].head()

0    292
1    282
2    205
3    185
4    165
Name: score_votes, dtype: object

In [18]:
#extracting total score
df2['total_score'] = df2.full_score.str.extract(r": (\d?\d?\,?\d?\d?\d?)")
df2['total_score'].head() 

0    28,539
1    27,566
2    20,049
3    17,684
4    16,070
Name: total_score, dtype: object

In [19]:
#extracting rating count
df2['rating_count'] = df2.full_rating.str.extract(r"(\d?\d?\d?\,?\d?\d?\d?) ratings")
df2['rating_count'].head()

0    785,502
1    212,935
2    922,308
3    702,272
4     77,664
Name: rating_count, dtype: object
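Look closely at row 1: the source says `5,212,935 ratings`, but the pattern only allows one comma, so the leading `5,` gets cut off. Here's one possible fix, sketched on two rows copied from the data: a character class accepts any mix of digits and commas, and then the commas are removed so the column can be a number.

```python
import pandas as pd

ratings = pd.Series([
    "4.07 avg rating — 785,502 ratings",
    "4.34 avg rating — 5,212,935 ratings",
])

# [\d,]+ means "one or more digits or commas", however many groups there are
rating_count = ratings.str.extract(r"([\d,]+) ratings", expand=False)

# Drop the commas, then convert the strings to integers
rating_count = rating_count.str.replace(",", "", regex=False).astype(int)
print(rating_count)
```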

In [20]:
#extracting average rating
df2['avg_rating'] = df2.full_rating.str.extract(r"(\d\.\d\d)")
df2['avg_rating'].head()

0    4.07
1    4.34
2    3.76
3    4.04
4    4.06
Name: avg_rating, dtype: object

In [21]:
#extracting title
df2['title'] = df2['title'].str.strip()
df2['title'].head()

0                                  The Handmaid's Tale
1              The Hunger Games (The Hunger Games, #1)
2               Frankenstein, or The Modern Prometheus
3    A Wrinkle in Time (A Wrinkle in Time Quintet, #1)
4                            The Left Hand of Darkness
Name: title, dtype: object

In [22]:
df2.head()

Unnamed: 0,full_rating,full_score,rank,title,url,series_no,series,score_votes,total_score,rating_count,avg_rating
0,"4.07 avg rating — 785,502 ratings","\nscore: 28,539,\n and\n292 people voted\n \n \n",1,The Handmaid's Tale,/book/show/38447.The_Handmaid_s_Tale,,,292,28539,785502,4.07
1,"4.34 avg rating — 5,212,935 ratings","\nscore: 27,566,\n and\n282 people voted\n \n \n",2,"The Hunger Games (The Hunger Games, #1)",/book/show/2767052-the-hunger-games,#1,The Hunger Games,282,27566,212935,4.34
2,"3.76 avg rating — 922,308 ratings","\nscore: 20,049,\n and\n205 people voted\n \n \n",3,"Frankenstein, or The Modern Prometheus",/book/show/18490.Frankenstein_or_The_Modern_Prometheus,,,205,20049,922308,3.76
3,"4.04 avg rating — 702,272 ratings","\nscore: 17,684,\n and\n185 people voted\n \n \n",4,"A Wrinkle in Time (A Wrinkle in Time Quintet, #1)",/book/show/18131.A_Wrinkle_in_Time,#1,A Wrinkle in Time Quintet,185,17684,702272,4.04
4,"4.06 avg rating — 77,664 ratings","\nscore: 16,070,\n and\n165 people voted\n \n \n",5,The Left Hand of Darkness,/book/show/18423.The_Left_Hand_of_Darkness,,,165,16070,77664,4.06


In [23]:
df2.drop('full_rating', axis=1, inplace=True)


In [24]:
df2.drop('full_score', axis=1, inplace=True)


In [25]:
df2.head()

Unnamed: 0,rank,title,url,series_no,series,score_votes,total_score,rating_count,avg_rating
0,1,The Handmaid's Tale,/book/show/38447.The_Handmaid_s_Tale,,,292,28539,785502,4.07
1,2,"The Hunger Games (The Hunger Games, #1)",/book/show/2767052-the-hunger-games,#1,The Hunger Games,282,27566,212935,4.34
2,3,"Frankenstein, or The Modern Prometheus",/book/show/18490.Frankenstein_or_The_Modern_Prometheus,,,205,20049,922308,3.76
3,4,"A Wrinkle in Time (A Wrinkle in Time Quintet, #1)",/book/show/18131.A_Wrinkle_in_Time,#1,A Wrinkle in Time Quintet,185,17684,702272,4.04
4,5,The Left Hand of Darkness,/book/show/18423.The_Left_Hand_of_Darkness,,,165,16070,77664,4.06


In [49]:
df2.to_csv('sci-fi_cleaned.csv', index=False)

## 1.3 Where you're just doing one of my former students' projects

Once upon a time my student Stefan did a project that involved some lawyer stuff. Most of the content was in PDFs, though! I converted them to text files and put them into the `pdfs` folder, and gave you code below to open up each of them and save their contents into a dataframe.

What a nice dataframe! I want you to add the following columns to it:

* `lawyer_app`, the applicant's lawyer (pro se means that they did it themselves, that's fine)
* `lawyer_gov`, the government's lawyer
* `judge`, the name of the judge
* `access`, whether the clearance is granted or denied (although you might miss a few)

Save as **court_cleaned.csv**.

**Note:** You can look at the original PDFs, they're also included.

**Note:** This uses a fun utility called `glob`, which is mostly fun because you use it as `glob.glob`. It's used to find files that match a certain filename pattern.
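A minimal sketch of `glob.glob` on a throwaway folder (the filenames are made up):

```python
import glob
import os
import tempfile

# Make a temporary folder holding two text files and one PDF
folder = tempfile.mkdtemp()
for name in ["a.txt", "b.txt", "notes.pdf"]:
    with open(os.path.join(folder, name), "w") as f:
        f.write("hello")

# * is a wildcard: "anything, as long as it ends in .txt".
# Glob patterns are NOT regular expressions, even though they look a bit alike.
txt_files = sorted(glob.glob(os.path.join(folder, "*.txt")))
names = [os.path.basename(path) for path in txt_files]
print(names)
```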

**BONUS:** You'll be happy once you get the judge, but make sure it doesn't have any extra punctuation on it.

**BONUS:** You can search for some words using `.str.contains("blah")` and save the result into new columns. Maybe `has_debt`, `has_bankruptcy`, etc.
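A sketch of what that could look like, with invented sentences standing in for the real documents. `case=False` is my addition so that "Debt" and "debt" both count.

```python
import pandas as pd

df = pd.DataFrame({"content": [
    "Applicant filed for bankruptcy in 2010.",
    "Applicant owes a large Debt.",
    "Applicant has foreign contacts.",
]})

# .str.contains gives True/False for each row
df["has_debt"] = df["content"].str.contains("debt", case=False)
df["has_bankruptcy"] = df["content"].str.contains("bankruptcy", case=False)
print(df[["has_debt", "has_bankruptcy"]])
```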

> It's okay if it isn't perfect. Converting PDF into data rarely is! Usually you get 90% of it done with computers, then send people to enter the other 10% by hand.

In [26]:
import glob
filenames = glob.glob("pdfs/*.txt")
contents = [open(filename, encoding="utf8").read() for filename in filenames]
df3 = pd.DataFrame({'filename': filenames, 'content': contents})

In [27]:

# None means "no limit"; newer versions of pandas no longer accept -1 here
pd.set_option('display.max_colwidth', None)





In [28]:
#extracting access info
df3['access'] = df3.content.str.extract(r"information is (\w+)")
df3['access'] 
                    

0     NaN    
1     NaN    
2     granted
3     NaN    
4     denied 
5     denied 
6     denied 
7     denied 
8     granted
9     denied 
10    granted
11    NaN    
12    denied 
13    granted
14    granted
Name: access, dtype: object
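Some of the `NaN` rows are misses because the decision is worded differently. The first document, for example, says "Clearance is denied." A possible way to catch more of them, sketched here with `|` for alternatives: the first string is quoted from that document, and the other two are invented samples.

```python
import pandas as pd

content = pd.Series([
    "could be subjected to adverse foreign influence. Clearance is denied.",
    "Eligibility for access to classified information is granted.",
    "No decision sentence here.",
])

# (?: ) groups the alternatives without capturing them,
# so the only captured thing is still granted/denied
access = content.str.extract(
    r"(?:information|Clearance) is (granted|denied)", expand=False
)
print(access)
```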

In [29]:
#extracting judge name
df3['judge'] = df3.content.str.extract(r"(\w+, \w+ ?\w?.?), Admin")
df3['judge']

0     MENDEZ, Francisco     
1     NaN                   
2     LOUGHRAN, Edward W.   
3     DUFFY, James F.       
4     NaN                   
5     HOWE, Philip S.       
6     COACHER, Robert E.    
7     DUFFY, James F.       
8     NaN                   
9     GOLDSTEIN, Jennifer I.
10    HOGAN, Erin C.        
11    NaN                   
12    MOGUL, Martin H.      
13    HARVEY, Mark          
14    NaN                   
Name: judge, dtype: object

In [30]:
#extracting applicant's lawyer
df3['lawyer_app'] = df3.content.str.extract(r"cant: (.*)")
df3['lawyer_app']

0     Pro se                  
1     NaN                     
2     Mark S. Zaid, Esq.      
3     Pro se                  
4     Pro se                  
5     Pro se                  
6     Pro se                  
7     Pro se                  
8     Ryan C. Nerney, Esquire 
9     Pro Se                  
10    Pro se                  
11    NaN                     
12    Pro se                  
13    Mark S. Zaid, Esq.      
14    Stephen Glassman, Esq.  
Name: lawyer_app, dtype: object

In [31]:
#extracting government's lawyer
df3['lawyer_gov'] = df3.content.str.extract(r"ment: (.*), E")
df3['lawyer_gov']                                     

0     David F. Hayes     
1     NaN                
2     Robert J. Kilmartin
3     Richard Stevens    
4     Julie R. Mendez    
5     David F. Hayes     
6     Stephanie C. Hess  
7     Tara Karoian       
8     Richard Stevens    
9     Robert Kilmartin   
10    Eric Borgstrom     
11    NaN                
12    Tovah Minster      
13    Julie R. Mendez    
14    Erin Thompson      
Name: lawyer_gov, dtype: object

Okay, now do the work and **make those new columns!**

In [32]:
df3.head()

Unnamed: 0,filename,content,access,judge,lawyer_app,lawyer_gov
0,pdfs/11-11916.h1.pdf.txt,"\n\nDEPARTMENT OF DEFENSE \n\nDEFENSE OFFICE OF HEARINGS AND APPEALS \n\n \n \n\n \n \nIn the matter of: \n \n \n \nApplicant for Security Clearance \n\nREDACTED \n\n \n\nISCR Case No. 11-11916 \n\nMENDEZ, Francisco, Administrative Judge: \n\n \nApplicant did not mitigate security concerns raised by his exercise of foreign \ncitizenship, including the possession of a current foreign passport. He also did not \nmitigate security concerns raised by his substantial ties to Russia through which he \ncould be subjected to adverse foreign influence. Clearance is denied. \n \n\nHistory of the Case \n\nOn August 15, 2009, Applicant submitted a security clearance application (SCA). \nHe voluntarily disclosed his dual U.S.-Russian citizenship, as well as his connections \nand property interest in Russia. \n\n \nOn April 17, 2014, the Department of Defense Consolidated Adjudications \nFacility sent Applicant a Statement of Reasons (SOR) alleging that his circumstances \nraised security concerns under the foreign preference and foreign influence guidelines.1 \n \n1 This action was taken under Executive Order (E.O.) 10865, Safeguarding Classified Information within \nIndustry (February 20, 1960), as amended; Department of Defense Directive 5220.6, Defense Industrial \nPersonnel Security Clearance Review Program (January 2, 1992), as amended (Directive); and the \nAdjudicative Guidelines implemented by the Department of Defense on September 1, 2006. \n\nFor Government: David F. 
Hayes, Esq., Department Counsel \n\nFor Applicant: Pro se \n\nAppearances \n\n______________ \n\n \nDecision \n\n______________ \n\n \n\n \n \n\n) \n) \n) \n) \n) \n \n \n\n \n \n\n \n\n \n1 \n \n \n\nOn July 26, 2014, Applicant answered the SOR, admitted all the SOR allegations, \nwaived his right to a hearing, and elected to have his case decided on the written \nrecord.2 \n \n \nOn July 22, 2015, Department Counsel prepared a file of relevant material \n(FORM) and sent it to Applicant. The FORM contains the SOR, Applicant’s answer, and \nresponses to two interrogatories, which were admitted into the record as Exhibits 1 – 4. \nDepartment Counsel also submitted with the FORM a request for administrative notice, \nExhibit (Ex.) 5, which is discussed below. \n \n \nprovided 30 days from its receipt to file a response, but did not submit one. \n \n \nOn December 1, 2015, I was assigned Applicant’s case and provided a copy of \nthe FORM. On my own motion, I opened the record to provide him a last chance \nopportunity to submit a response to the FORM. He was also advised of the serious \nsecurity concerns raised by his possession of a current foreign passport and that such \nconcerns may be mitigated by surrendering or relinquishing the passport as set forth in \nthe Directive.4 Applicant did not submit a response or provide additional documentation. \nThe record closed on December 15, 2015. \n \n\nOn August 31, 2015, Applicant acknowledged receipt of the FORM.3 He was \n\nAdministrative Notice: The Russian Federation (Russia) \n\nDOHA administrative \n\njudges may accept \n\n \n \nfor administrative notice \nuncontroverted, easily verifiable facts regarding a foreign country from official U.S. \nGovernment reports. Additionally, the official position of relevant federal agencies or the \npertinent statements of key U.S. Government officials regarding a foreign country may \nbe appropriate for administrative notice. 
The party requesting administrative notice of a \nparticular matter must provide the source document, either the pertinent parts or the full \ndocument, to allow the judge and, if necessary, the Appeal Board to assess the \nreliability, accuracy, and relevancy of any administratively noticed fact. See generally, \nISCR Case No. 08-09480 (App. Bd. Mar. 17, 2010); ISCR Case No. 05-11292 (App. Bd. \nApr. 12, 2007). \n \n\nDepartment Counsel did not submit the source documents (or, relevant portions \nthereof) with the FORM. Instead, Department Counsel’s request for administrative \nnotice, Ex. 5, cites to the web addresses where the source documents can be located. \nRecently, the Appeal Board held that, irrespective of whether an applicant raises an \nobjection to a matter requested for administrative notice, citation to a web address alone \nis insufficient. The Board went on to reiterate its long-held position that the actual \nsource document the judge relies upon for an administratively noticed fact must be \n\n \n2 Hearing Exhibit I. \n \n3 Hearing Exhibit II. \n \n4 Hearing Exhibit III. \n\n \n2 \n \n \n\nmade a part of the record. See ISCR Case No. 14-01655 (App. Bd. Nov. 3, 2015), case \nremanded because the source documents not included in the record.5 \n\n \nAccordingly, I have marked and included in the record as Hearing Exhibit IV the \nsource documents (or, the pertinent portions thereof), which provide a basis for the \nfollowing relevant facts regarding Russia:6 \n \nRussia “has a highly centralized, weak multi-party political system dominated by \n \nPresident Vladimir Putin.”7 A recent human rights report from the U.S. 
State Department \nreflects that the Russian government committed significant human rights violations and \n“the [Russian] government failed to take adequate steps to prosecute or punish most \nofficials who committed abuses, resulting in a climate of impunity.”8 \n \n \nIn 2015, the Director of National Intelligence reported to Congress that the \nleading state intelligence threats to the United States will continue to come from two \nmain countries, one of which is Russia.9 \n \n\n \nApplicant was born, raised, and educated in Russia. He received a commission \nas an officer in Russia’s reserve military forces and worked for a time for the Russian \nDefense Ministry. He immigrated to the United States in 1997 and married his wife, who \nis also originally from Russia, in 2002. They have one child, who was born in the United \nStates. \n \n\nApplicant became a U.S. citizen in 2005. Applicant, his wife, and his child all \nhave dual U.S.-Russian citizenships. Since becoming a U.S. citizen, Applicant has \n \n5 Department Counsel’s “failure” to submit the source documents is understandable as the FORM \npredates the cited Appeal Board decision. Arguably, Ex. 5 could be admitted as a summary of the \npertinent facts contained in the source documents. See generally, Directive, Enclosure 3, ¶ E3.1.19 \n(Federal Rules of Evidence (F.R.E.) shall serve as a guide in DOHA proceedings and technical rules of \nevidence may be relaxed to permit the development of a full and complete record); F.R.E. 201; F.R.E. \n1006. However, based on the present record, I cannot find that Applicant’s failure to raise an objection to \nEx. 5 amounts to an agreement as to the exhibits accuracy, reliability, and relevancy. Contrast with, ISCR \nCase No. 14-03112 (App. Bd. Nov. 3, 2015), after applicant concurred with its content, the only evidence \nregarding the foreign country was the administrative notice request that was admitted as a summary. 
\n \n6 Applicant was provided notice regarding these source documents with the FORM. He was also provided \nample opportunity to challenge or provide additional information regarding the matters requested by the \nGovernment for administrative notice. Although the Government’s citation to 26 source documents raises \npotential notice and fairness concerns, the additional time Applicant was provided to respond and provide \nadditional information ameliorated any such concerns. \n \n7 Director of National Intelligence, Statement for the Record, Senate Armed Services Committee, \nWorldwide Threat Assessment of the U.S. Intelligence Community at 4, February 26, 2015. \n \n8 U.S. State Department, Russia 2013 Human Rights Report at 1. \n \n9 Id. at 1-2. \n\nFindings of Fact \n\n \n\n \n3 \n \n \n\nvoted in Russian elections and twice renewed his Russian passport. His current \nRussian passport is due to expire in 2022. He is unwilling to surrender or relinquish his \nRussian passport because he may need it to travel to Russia on short notice to visit his \nelderly parents. \n \n\nApplicant has a number connections and contacts in Russia. Notably, Applicant’s \nparents and his wife’s mother and her siblings are citizens and residents of Russia. \nApplicant’s father used to work for the Russian government. His parents are now \nretired. Applicant has traveled to Russia to visit his family, with his most recent trip \noccurring in 2012. He has used his Russian passport to travel to Russia. He owns an \napartment in Russia, which he has left to his parents to dispose of as they see fit. He \nand his wife maintain contact with their relatives and at least one friend in Russia. \n\n \n\n \n\nPolicies \n\n“[N]o one has a ‘right’ to a security clearance.” Department of the Navy v. \nEgan, 484 U.S. 518, 528 (1988). 
Individual applicants are eligible for access to \nclassified information “only upon a finding that it is clearly consistent with the national \ninterest” to authorize such access. E.O. 10865 § 2. \n\n \nWhen evaluating an applicant’s eligibility \n\nfor a security clearance, an \nadministrative judge must consider the adjudicative guidelines (AG). In addition to brief \nintroductory explanations, the guidelines list potentially disqualifying and mitigating \nconditions. The guidelines are not inflexible rules of law. Instead, recognizing the \ncomplexities of human behavior, an administrative judge applies the guidelines in a \ncommonsense manner, considering all available and reliable information, in arriving at a \nfair and impartial decision. \n\n \nDepartment Counsel must present evidence to establish controverted facts \nalleged in the SOR. Directive ¶ E3.1.14. Applicants are responsible for presenting \n“witnesses and other evidence to rebut, explain, extenuate, or mitigate facts admitted by \nthe applicant or proven . . . and has the ultimate burden of persuasion as to obtaining a \nfavorable clearance decision.” Directive ¶ E3.1.15. \n\n \nAdministrative Judges are responsible for ensuring that due process proceedings \nare conducted “in a fair, timely and orderly manner.” Directive ¶ E3.1.10. Judges make \ncertain that an applicant receives fair notice of the issues raised, has a reasonable \nopportunity to litigate those issues, and is not subjected to unfair surprise. ISCR Case \nNo. 12-01266 at 3 (App. Bd. Apr. 4, 2014). \n\n \nIn resolving the ultimate question regarding an applicant’s eligibility, an \nadministrative judge must resolve “[a]ny doubt concerning personnel being considered \nfor access to classified information . . . in favor of national security.” AG ¶ 2(b). 
\nMoreover, recognizing the difficulty at times in making suitability determinations and the \nparamount importance of protecting national security, the Supreme Court has held that \n\n \n4 \n \n \n\n“security clearance determinations should err, if they must, on the side of denials.” \nEgan, 484 U.S. at 531. \n\nthe Government predicated upon \n\n \nA person who seeks access to classified information enters into a fiduciary \n \nrelationship with \ntrust and confidence. This \nrelationship transcends normal duty hours. The Government reposes a high degree of \ntrust and confidence in individuals to whom it grants access to classified information. \nDecisions include, by necessity, consideration of the possible risk an applicant may \ndeliberately or inadvertently fail to safeguard classified information. Such decisions \nentail a certain degree of legally permissible extrapolation of potential, rather than \nactual, risk of compromise of classified information. \n \n\nClearance decisions must be made “in terms of the national interest and shall \nin no sense be a determination as to the loyalty of the applicant concerned.” E.O. \n10865 § 7. Thus, a decision to deny a security clearance amounts to a finding that an \napplicant, at the time the decision was rendered, did not meet the strict guidelines \nestablished for determining eligibility for access to classified information. \n\n \n\nAnalysis \n\n \n\nGuideline C, Foreign Preference \n \n \nUnder AG ¶ 9, the foreign preference security concern arises “[w]hen an \nindividual acts in such a way as to indicate a preference for a foreign country over the \nUnited States.” Applicant actively exercised his foreign citizenship after becoming a U.S. \ncitizen, including voting in Russian elections and renewing, using, and maintaining a \nRussian passport. 
This record evidence raises the foreign preference security concern \nand establishes the disqualifying condition at AG ¶ 10(a).10 The foreign preference \nguideline also sets forth a number of potential mitigation conditions. I have considered \nall the mitigating conditions and none apply. Notably, Applicant currently has a foreign \npassport and is unwilling to relinquish or surrender it. A current or prospective clearance \nholder is ineligible for a security clearance unless s/he surrenders it to the cognizant \nsecurity authority (CSA), invalidates the foreign passport, or receives approval from the \nCSA for his/her continued possession and use of the foreign passport. See AG ¶¶ 11(d) \nand 11(e). Applicant did not supply any information that any of the preceding \ncircumstances apply. Accordingly, foreign preference security concerns remain. \n \nGuideline B, Foreign Influence \n \n\nThe foreign influence security concern is explained at AG ¶ 6: \n \nForeign contacts and interests may be a security concern if the individual \nhas divided loyalties or foreign financial interests, may be manipulated or \ninduced to help a foreign person, group, organization, or government in a \n\n \n10 Exercise of any right, privilege or obligation of foreign citizenship after becoming a U.S. citizen . . . This \nincludes . . . (1) possession of a current foreign passport; . . . (7) voting in a foreign election. \n\n \n5 \n \n \n\nis \n\nlimited \n\ninterest \n\nlocated, \n\nincluding, but not \n\nway that is not in U.S. interests, or is vulnerable to pressure or coercion by \nany foreign interest. 
Adjudication under this Guideline can and should \nconsider the identity of the foreign country in which the foreign contact or \nfinancial \nto, such \nconsiderations as whether the foreign country is known to target United \nStates citizens to obtain protected information and/or is associated with a \nrisk of terrorism.11 \n \n \nAn individual is not automatically disqualified from holding a security clearance \nbecause they have connections and interests in a foreign country. Instead, in assessing \nan individual’s vulnerability to foreign influence, an administrative judge must take into \naccount the foreign government involved; the intelligence gathering history of that \ngovernment; the country’s human rights record; and other pertinent factors.12 \n \nApplicant and his wife’s connections and property interest in Russia raise the \n \nforeign influence security concern. The record evidence, to include the matters \naccepted for administrative notice, establish the following disqualifying conditions: \n \n\n \n6 \n \n \n\nforeign \n\ncontact with a \n\nto protect sensitive \n\nAG ¶ 7(a): \nfamily member, business or \nprofessional associate, friend, or other person who is a citizen of or \nresident in a foreign country if that contact creates a heightened risk of \nforeign exploitation, inducement, manipulation, pressure, or coercion; \n \nAG ¶ 7(b): \nconnections to a foreign person, group, government, or \ncountry that create a potential conflict of interest between the individual’s \nobligation \nthe \nindividual’s desire to help a foreign person, group, or country by providing \nthat information; \n \nAG ¶ 7(d): \nsharing living quarters with a person or persons, regardless \nof citizenship status, if that relationship creates a heightened risk of \nforeign inducement, manipulation, pressure, or coercion; and \n \nAG ¶ 7(e): a substantial business, financial, or property interest in a \nforeign country, or in any foreign-owned or foreign-operated 
business, \nwhich could subject the individual to heightened risk of foreign influence or \nexploitation. \n\ntechnology and \n\ninformation or \n\n \n \nAn applicant with close family members and interests in a foreign country faces a \nhigh, but not insurmountable hurdle in mitigating security concerns raised by such \n \n11 ISCR Case No. 09-07565 at 3 (App. Bd. July 12, 2012) (“As the Supreme Court stated in Egan, a \nclearance adjudication may be based not only upon conduct but also upon circumstances unrelated to \nconduct, such as the foreign residence of an applicant’s close relatives.”) (emphasis added) (internal \ncitation omitted). \n \n12 ISCR Case No. 05-03250 at 4 (App. Bd. Apr. 6, 2007) (setting forth factors an administrative judge \nmust consider in foreign influence cases). \n\nforeign ties. Furthermore, an applicant is not required “to sever all ties with a foreign \ncountry before he or she can be granted access to classified information.”13 However, \nwhat factor or combination of factors will mitigate security concerns raised by an \napplicant with family members in a foreign country is not easily identifiable or \nquantifiable.14 An administrative judge’s predictive judgment in these types of cases \nmust be guided by a commonsense assessment of the evidence and consideration of \nthe adjudicative guidelines, as well as the whole-person factors set forth in the Directive. \nA judge’s ultimate determination must also take into account the overarching standard \nin all security clearance cases, namely, that any doubt raised by an applicant’s \ncircumstances must be resolved in favor of national security. AG ¶ 2(b). \n \n \nI have considered all the foreign influence mitigating conditions and, based on \nthe record evidence, none apply. 
Even if I assume for the sake of argument that \nApplicant’s property and non-familial connections in Russia are not significant enough to \ncause a conflict of interest with his security obligations, his and his wife’s close \nrelationship to their family members in Russia raise a concern about his vulnerability or \nsusceptibility to adverse foreign influence. Applicant’s relatives in Russia are subject to \nthe dictates of a government whose respect for human rights and the rule of law is, at \nbest, questionable. Moreover, Russia has been identified by the U.S. Government as \ncontinuing to pose an intelligence threat. Applicant’s familial connections in Russia, \ncoupled with the threat posed by the current Russian government to its own people and \nthe security threat it poses to the United States, raises a heightened risk that Applicant \ncould be subjected to adverse foreign influence. Applicant did not present sufficient \ninformation to mitigate this security concern. However, this adverse finding is “not a \ncomment on Applicant’s patriotism but merely an acknowledgment that people may act \nin unpredictable ways when faced with choices that could be important to a loved-one, \nsuch as a family member.” ISCR Case No. 08-10025 at 4 (App. Bd. Nov. 3, 2009). \n \nWhole-Person Concept \n \n \nUnder the whole-person concept, an administrative judge must evaluate an \napplicant’s eligibility for a security clearance by considering the totality of an applicant’s \nconduct and all the relevant circumstances. An administrative judge should consider the \nnine factors listed at AG ¶ 2(a).15 I hereby incorporate my comments under Guidelines \nB and C. I gave due consideration to all the favorable and extenuating factors in this \ncase, including Applicant’s honesty about his foreign connections from the start of the \nsecurity clearance process. Furthermore, I recognize that Applicant left Russia nearly \n \n13 ISCR Case No. 07-13739 at 4 (App. Bd. Nov. 12, 2008). 
\n \n14 ISCR Case No. 11-12202 at 5 (App. Bd. June 23, 2014). \n \n15 The non-exhaustive list of factors are: (1) the nature, extent, and seriousness of the conduct; (2) the \ncircumstances surrounding the conduct, to include knowledgeable participation; (3) the frequency and \nrecency of the conduct; (4) the individual’s age and maturity at the time of the conduct; (5) the extent to \nwhich participation is voluntary; (6) the presence or absence of rehabilitation and other permanent \nbehavioral changes; (7) the motivation for the conduct; (8) the potential for pressure, coercion, \nexploitation, or duress; and (9) the likelihood of continuation or recurrence. \n\n \n7 \n \n \n\n20 years ago and has made the United States his home. However, his possession of a \ncurrent Russian passport and close familial connections in Russia through whom he \ncould be adversely influenced raise serious security concerns. Accordingly, after \nweighing the favorable and unfavorable evidence, I find that he failed to mitigate the \nsecurity concerns at issue. Overall, the record evidence leaves me with doubts about \nhis eligibility for access to classified information. \n \n\nConclusion \n\n \n\n \nIn light of the circumstances presented by the record in this case, it is not clearly \nconsistent with the national interest to grant Applicant access to classified information. \nApplicant’s request for a security clearance is denied. 
\n \n \n\n \n\nFormal findings for or against Applicant on the allegations set forth in the SOR, \n\nFormal Findings \n\nParagraph 1, Guideline C (Foreign Preference) \n\n \n \nas required by section E3.1.25 of Enclosure 3 of the Directive, are: \n \n \n \n \n \n \n \n \n \n\nParagraph 2, Guideline B (Foreign Influence) \n\nSubparagraphs 1.a – 1.d: \n\nSubparagraphs 2.a – 2.h: \n\n \n\n \n\n \n\n \n\n \n\n \n\n AGAINST APPLICANT \n\n Against Applicant \n\n AGAINST APPLICANT \n\n Against Applicant \n\n____________________ \n\nFrancisco Mendez \nAdministrative Judge \n\n \n8 \n \n \n\n",,"MENDEZ, Francisco",Pro se,David F. Hayes
1,pdfs/12-01601.a1.pdf.txt,"KEYWORD: Guideline F\n\nDIGEST: Most of the document Applicant submitted on appeal were not previously submitted to\nthe Judge and the Board is prohibited from receiving or considering them. Adverse decision\naffirmed.\n\nCASENO: 12-01601.a1\n\nDATE: 09/23/2016\n\nIn Re:\n\n-------\n\nApplicant for Security Clearance\n\nDATE: September 23, 2016\n\nISCR Case No. 12-01601\n\n)\n)\n)\n)\n)\n)\n)\n)\n\nAPPEAL BOARD DECISION\n\nAPPEARANCES\n\nFOR GOVERNMENT\n\nJames B. Norman, Esq., Chief Department Counsel\n\nFOR APPLICANT\n\nPro se\n\nThe Department of Defense (DoD) declined to grant Applicant a security clearance. On\nAugust 19, 2015, DoD issued a statement of reasons (SOR) advising Applicant of the basis for that\ndecision–security concerns raised under Guideline F (Financial Considerations) of Department of\nDefense Directive 5220.6 (Jan. 2, 1992, as amended) (Directive). Applicant requested a decision\non the written record. On June 27, 2016, after considering the record, Defense Office of Hearings\nand Appeals (DOHA) Administrative Judge Martin H. Mogul denied Applicant’s request for a\nsecurity clearance. Applicant appealed pursuant to Directive ¶¶ E3.1.28 and E3.1.30.\n\nApplicant raised the following issues on appeal: whether the Judge erred in making findings\nof fact and whether the Judge’s decision was arbitrary, capricious, or contrary to law. Consistent\nwith the following, we affirm.\n\nThe Judge’s Findings of Fact \n\nApplicant is 46 years old, married, and has four children. He earned a bachelor’s degree in\n1994. Since 2010, he has been employed by a defense contractor. In his security clearance\napplication, he explained that he encountered difficultly paying bills in 2008 after being laid off\nfrom a second job due to the company’s bankruptcy. 
He gave no explanation for why he did not\nbegin attempting to resolve the debts until the issuance of the SOR.\n\nApplicant admitted each of the five delinquent debts alleged in the SOR. For the debt in\nSOR ¶ 1.a for about $500, Applicant wrote in his Response to Department Counsel’s File of\nRelevant Material (FORM) that the debt was paid off. The creditor agreed to accept three monthly\npayments of about $100 to settle the debt. A document showed the creditor made arrangements with\nApplicant to debit electronically about $100 from his account in September 2015 for this debt. For\nthe 2009 judgment in SOR ¶ 1.b for about $14,000, Applicant indicated that, at the time he answered\nthe SOR, he was unable to negotiate a settlement. In his Response to the FORM, he provided\ndocuments showing his wages were being garnished biweekly for the next six months for this debt\nand about $350 was garnished from his pay at the end of November 2015. No evidence was\npresented to establish the total amount deducted from his pay or how much is still owed on that debt. \nFor the debts in SOR ¶¶ 1.c, 1.b, and 1.c for about $2,500, $5,600, and $3600, Applicant wrote in\nhis Response to the FORM that he is currently on a schedule to repay between $150 and $190 a\nmonth on each debt until it is resolved. He provided proof of payments toward those debts from\nSeptember through November 2015.\n\nThe Judge’s Analysis\n\nThe Judge found Applicant did not act responsibly because, even though he has been\nemployed by his current employer since 2010, he only began making payments towards the debts\nafter the SOR was issued. The Judge also concluded that, while the mitigation condition concerning\nthe initiation of a good-faith effort to repay the creditors was applicable, it was not controlling\nbecause Applicant must establish a consistent history of continuing to resolve his debts. 
\n\nDiscussion\n\n2\n\nIn the appeal brief, Applicant provided character reference letters and documents showing\npayments on, or settlement of, various debts. Most of those documents, however, were not\npreviously submitted to the Judge and constitute new evidence that the Appeal Board is prohibited\nfrom receiving or considering. Directive ¶ E3.1.29. \n\nApplicant claims the Judge erred in the findings of fact. For example, he states that he has\nsix children (instead of four) and started working for his current employer in 2012 (instead of 2010). \nIn his security clearance application, however, Applicant listed that he had four children and\nindicated he started working for his employer in 2010. He failed to establish that the Judge erred\nin the findings of fact. Our review reveals the Judge’s material findings are based upon substantial\nevidence or constitute reasonable inferences or conclusions that could be drawn from the record\nevidence. See, e.g., ISCR Case No. 12-03420 at 3 (App. Bd. July 25, 2014). Applicant also argues\nthat he mitigated the security concerns arising from his debts. His arguments are not sufficient to\nshow that the Judge weighed the evidence in a manner that was arbitrary, capricious, or contrary to\nlaw. See, e.g., ISCR Case No. 14-06634 at 2 (App. Bd. Apr. 28, 2016). \n \n\nThe Judge examined the relevant data and articulated a satisfactory explanation for the\ndecision. The decision is sustainable on this record. “The general standard is that a clearance may\nbe granted only when ‘clearly consistent with the interests of the national security.’” Department\nof the Navy v. Egan, 484 U.S. 518, 528 (1988). See also Directive, Enclosure 2 ¶ 2(b): “Any doubt\nconcerning personnel being considered for access to classified information will be resolved in favor\nof the national security.”\n\nThe Decision is AFFIRMED. 
\n\nOrder\n\nSigned: Michael Ra’anan \nMichael Ra’anan\nAdministrative Judge\nChairperson, Appeal Board\n\nSigned: James E. Moody \nJames E. Moody\nAdministrative Judge\nMember, Appeal Board\n\nSigned; James F. Duffy \nJames F. Duffy\nAdministrative Judge\nMember, Appeal Board\n\n3\n\n",,,,
2,pdfs/11-03073.h1.pdf.txt,"\n\n DEPARTMENT OF DEFENSE \n\n DEFENSE OFFICE OF HEARINGS AND APPEALS \n\n \n \nIn the matter of: \n \n \n \nApplicant for Security Clearance \n\n \n\n \n\nISCR Case No. 11-03073 \n\n \n \n\n) \n) \n) \n) \n) \n \n \n\n \n \n\nAppearances \n\n______________ \n\n \nDecision \n\n______________ \n\n \n\n \n\n \n\n \n \n\nFor Government: Robert J. Kilmartin, Esq., Department Counsel \n\nFor Applicant: Mark S. Zaid, Esq. \n\nLOUGHRAN, Edward W., Administrative Judge: \n\n \nApplicant mitigated the financial considerations security concerns. Eligibility for \n\naccess to classified information is granted. \n \n\nStatement of the Case \n\nOn October 28, 2014, the Department of Defense (DOD) issued a Statement of \nReasons (SOR) to Applicant detailing security concerns under Guideline F, financial \nconsiderations. The action was taken under Executive Order (EO) 10865, Safeguarding \nClassified Information within Industry (February 20, 1960), as amended; DOD Directive \n5220.6, Defense Industrial Personnel Security Clearance Review Program (January 2, \n1992), as amended (Directive); and the adjudicative guidelines (AG) implemented by \nthe DOD on September 1, 2006. \n\n \nApplicant responded to the SOR on November 14, 2014, and requested a \nhearing before an administrative judge. The case was originally assigned to me on \nFebruary 27, 2015. Scheduling of the case was delayed because Applicant was working \noverseas. The case was reassigned to me on May 4, 2015. After coordinating with the \nparties, the Defense Office of Hearings and Appeals (DOHA) issued a notice of hearing \n\n \n\n \n1 \n\non July 16, 2015, scheduling the hearing for August 18, 2015. The hearing was \nconvened as scheduled. DOHA received the hearing transcript (Tr.) on August 26, \n2015. \n \n\nEvidentiary Rulings \n\nGovernment Exhibits (GE) 1, 2, 4, 5, and 6 were admitted in evidence without \nobjection. GE 3 was admitted over Applicant’s objection. 
Applicant testified, called five \nwitnesses, and submitted Applicant’s Exhibits (AE) A through Z, which were admitted \nwithout objection. The record was held open for Applicant to submit additional \ninformation. He submitted documents that I have marked AE AA and BB1 and admitted \nwithout objection. \n \n\nFindings of Fact \n\n \n\n \n\n \n \nApplicant is a 57-year-old self-employed contractor for a defense contractor. He \nserved in the U.S. military from 1979 until he retired in 2000. He seeks to retain a \nsecurity clearance, which he has held since he was in the military. He attended college \nfor a period, but he is a few credits shy of a degree. His first marriage ended in divorce. \nHe has been married to his current spouse for more than 25 years. He has two adult \nchildren.2 \n \n \nApplicant spent the majority of his military career in special operations. He went \non numerous combat missions, conducted clandestine insertions, and operated \nundercover. When he retired he became an entrepreneur and started his own \ncompanies. His primary company was incorporated, but Applicant had to personally \nguarantee many of the company’s liabilities. He also invested in real estate. His primary \ncompany suffered significant setbacks during the recession and housing crisis of the \nlater part of the 2000s. His real estate properties lost much of their value. He closed his \nprimary company in 2009 and has been working since then as a self-employed \ncontractor and subcontractor for other companies. 
He has spent much of the last five \nyears working overseas in dangerous assignments.3 \n \n \nThe SOR alleges nine delinquent debts, which include a $58,161 judgment (SOR \n¶ 1.a), mortgage loans (SOR ¶ 1.c - $30,994; SOR ¶ 1.d - $351,562; SOR ¶ 1.e - \n$271,900; and SOR ¶ 1.f - $50,688), two business-related debts (SOR ¶ 1.b - $18,028 \nand SOR ¶ 1.h - $77,115) and federal taxes (SOR ¶ 1.g - $33,882 for tax year 2008, \nand SOR ¶ 1.i – no amount alleged for tax years 2012 and 2013). Applicant admitted \nowing most of the debts at some point, but several of the debts were paid, settled, or \notherwise resolved. \n \n\n \n1 Counsel marked the post-hearing submissions as Z and AA, but AE Z was already admitted at the \nhearing. \n \n2 Tr. at 118, 123-127, 184; GE 1, 3 AE A. \n \n3 Tr. at 28-31, 120, 127-131; GE 1, 3; AE A, T. \n\n \n2 \n\n \nApplicant’s primary company was growing, and it leased a larger warehouse in \nabout 2008. The business ultimately failed, and the company broke the lease. The \nowner of the warehouse sued Applicant personally and obtained a $58,161 judgment \nagainst him (SOR ¶ 1.a). Applicant’s attorney is negotiating a settlement on this \njudgment.4 \n \n \nApplicant and a partner had a falling out over the business. The ex-partner sued \nApplicant for $160,000. They reached a settlement through arbitration in which \nApplicant agreed to pay the ex-partner $163,000 by 2010. He paid about $86,000, but \nhe was unable to pay the remainder when the company closed. The partner either \nobtained a judgment for $77,000 or sought to enforce the arbitrated settlement. \nApplicant’s attorney is negotiating a settlement for the remainder owed to the ex-\npartner.5 \n \nIn February 2013, Applicant settled the $20,302 business credit card debt alleged \n \nin SOR ¶ 1.b (alleged as $18,028 in SOR) for $10,151, which was paid in February \n2013. 
Applicant submitted proof in his response to DOHA interrogatories in May 2014 \nthat the debt had been settled. The February 2014 credit report lists the account with a \nzero balance, with the annotation that it was transferred, sold, and paid.6 \n \n \nApplicant was due a refund from his 2007 federal income taxes, but he owed the \nIRS for tax year 2008. The IRS filed a $33,882 tax lien against Applicant in June 2010 \n(SOR ¶ 1.g). In January 2011, the IRS applied his $15,892 refund from his 2007 taxes \nto his 2008 taxes. Applicant paid the remaining $5,689 owed for 2008 in January 2013. \nThe 2014 credit reports obtained by the Government report the lien as released in \nJanuary 2013.7 \n \n \nApplicant owed the IRS $65,863 for his 2012 and 2013 federal income taxes. In \nApril 2015, he submitted a proposed installment agreement to the IRS whereby he \nwould pay an initial $915 and then $500 per month until paid. In July 2015, the IRS \naccepted Applicant’s installment agreement and added his 2014 taxes. On June 5, \n2015, Applicant paid the IRS $56,415 for his 2014 taxes. On July 9, 2015, he paid \n$10,000 toward his estimated taxes for 2015.8 \n \n \nApplicant used to own five properties consisting of his residence and four rental \nproperties. Some of his tenants did not pay their rent, and he also had a property vacant \nfor an extended period. Applicant bought his home and another property when he was \nin the military. He still has those properties, plus another, but he lost two properties to \nforeclosure. SOR ¶¶ 1.d and 1.f allege the first ($351,562) and second ($50,688) \n \n4 Tr. at 132-133, 173-175, 185-186; Applicant’s response to SOR; GE 3; AE X, BB. \n \n5 Tr. at 176-177, 188-190; Applicant’s response to SOR; GE 1, 3; AE BB. \n \n6 Tr. at 151, 177-178; Applicant’s response to SOR; GE 3-6; AE M, O. \n \n7 Tr. at 134-140; Applicant’s response to SOR; GE 1, 3; AE Q. \n \n8 Tr. at 127, 140-148; Applicant’s response to SOR; GE ; AE R, S, V. 
\n\n \n\n \n3 \n\nmortgage loans on one of the foreclosed properties. The $351,562 figure is listed on the \ncredit reports as the high credit on the first mortgage loan, not a balance. The 2010 \ncredit report lists the account before foreclosure as $8,609 past due, with a $274,274 \nbalance. The more recent credit reports list the first mortgage loan as foreclosed with a \nzero balance. TransUnion reported in August 2015: “Credit grantor reclaimed collateral \nto settle defaulted mortgage[.] Foreclosure proceedings started[.]”9 \n \nThe second mortgage loan is listed on the three oldest credit reports with a \n \n$50,688 balance. Equifax reported a $50,688 balance in the combined credit report \nfrom August 2015. TransUnion reported the loan as closed with a zero balance. \nApplicant’s online account snapshot of the second mortgage loan shows a zero \nbalance. Applicant stated that he thought the first and second mortgage loans were \nresolved by the foreclosure. He has never been contacted by the holders of the first and \nsecond mortgage loans seeking a deficiency owed on the loans.10 \n \n \nSOR ¶ 1.e alleges the mortgage loan ($271,900) on the second foreclosed \nproperty. The $271,900 figure is listed on the credit reports as the high credit on the \nmortgage loan, not a balance. The reports list a zero balance for the loan. The 2010 and \n2015 credit reports list the account as “Foreclosure redeemed,” and “Credit grantor \nreclaimed collateral to settle defaulted mortgage[.]” Applicant has never been contacted \nabout a deficiency owed on the loan.11 \n \n \nSOR ¶ 1.c alleges the $30,994 charged-off second mortgage loan on Applicant’s \nhome. He settled the debt for $7,517, which he paid in August 2015. Applicant is current \non the first mortgage loan on his residence and the mortgage loans on his two \nremaining investment properties.12 \n \n \nApplicant paid other debts that were not alleged in the SOR. 
He repaid a $10,000 \nloan from his former commanding officer. He borrowed $50,000 from a friend who he \nserved with in the military in an effort to maintain his business. He paid the friend and \nthen his friend’s widow after his friend passed away. Applicant made the last payment of \n$14,000 in August 2015.13 \n \n \nApplicant’s accountant, friends, and business associates recommended that he \nfile bankruptcy to discharge the debts accrued from his failed business and the real \nestate collapse. He chose not to file bankruptcy because he wanted to pay his debts, \nand he was afraid that a bankruptcy would adversely affect his security clearance. He \nhas been working overseas under dangerous conditions during the last five years to \n\n \n9 Tr. at 151, 186-187; Applicant’s response to SOR; GE 3-6; AE O. \n \n10 Tr. at 151-156, 188; Applicant’s response to SOR; GE 4-6; AE O, P. \n \n11 Tr. at 151, 187-188; Applicant’s response to SOR; GE 4-6; AE O. \n \n12 Tr. at 156-163, 179; Applicant’s response to SOR; GE ; AE AA, BB. \n \n13 Tr. at 65, 170-173, 192-193; Applicant’s response to SOR; GE 3-6; AE O, Z. \n\n \n4 \n\n \n\nearn the money that has enabled him to pay his debts. He credibly testified that it will \ntake time, but he is committed to resolving all his debts.14 \n \n \nApplicant’s character evidence was extraordinary. He submitted numerous letters \nand several witnesses testified. Many of the authors and witnesses served with \nApplicant in special operations. His former commanding officer testified that he served \nwith Applicant under conditions that the commanding officer modesty described as \n“quite stressful.” The endorsements of the witnessses and authors were unequivocal \nand exceptional.15 \n \n\nPolicies \n\n \n\nWhen evaluating an applicant’s suitability \n\n \nthe \nadministrative judge must consider the adjudicative guidelines. 
In addition to brief \nintroductory explanations for each guideline, the adjudicative guidelines list potentially \ndisqualifying conditions and mitigating conditions, which are to be used in evaluating an \napplicant’s eligibility for access to classified information. \n \n\nfor a security clearance, \n\nThese guidelines are not inflexible rules of law. Instead, recognizing the \ncomplexities of human behavior, administrative judges apply the guidelines in \nconjunction with the factors listed in the adjudicative process. The administrative judge’s \noverarching adjudicative goal is a fair, impartial, and commonsense decision. According \nto AG ¶ 2(c), the entire process is a conscientious scrutiny of a number of variables \nknown as the “whole-person concept.” The administrative judge must consider all \navailable, reliable information about the person, past and present, favorable and \nunfavorable, in making a decision. \n\n \nThe protection of the national security is the paramount consideration. AG ¶ 2(b) \nrequires that “[a]ny doubt concerning personnel being considered for access to \nclassified information will be resolved in favor of national security.” \n\n \nUnder Directive ¶ E3.1.14, the Government must present evidence to establish \ncontroverted facts alleged in the SOR. Under Directive ¶ E3.1.15, the applicant is \nresponsible for presenting “witnesses and other evidence to rebut, explain, extenuate, \nor mitigate facts admitted by the applicant or proven by Department Counsel.” The \napplicant has the ultimate burden of persuasion to obtain a favorable security decision. \n\nthe Government predicated upon \n\n \nA person who seeks access to classified information enters into a fiduciary \n \nrelationship with \ntrust and confidence. This \nrelationship transcends normal duty hours and endures throughout off-duty hours. The \nGovernment reposes a high degree of trust and confidence in individuals to whom it \ngrants access to classified information. 
Decisions include, by necessity, consideration of \nthe possible risk the applicant may deliberately or inadvertently fail to safeguard \nclassified information. Such decisions entail a certain degree of legally permissible \n \n14 Tr. at 50, 180-181, 190-193, 197; Applicant’s response to SOR; GE 3 ; AE A, U, W, Y. \n \n15 Tr. at 25-28, 50-51, 62-74; AE B-L, T. \n\n \n\n \n5 \n\nextrapolation of potential, rather than actual, risk of compromise of classified \ninformation. \n \n\nSection 7 of EO 10865 provides that adverse decisions shall be “in terms of the \nnational interest and shall in no sense be a determination as to the loyalty of the \napplicant concerned.” See also EO 12968, Section 3.1(b) (listing multiple prerequisites \nfor access to classified or sensitive information). \n \n\n \nGuideline F, Financial Considerations \n \n\nAnalysis \n\nThe security concern for financial considerations is set out in AG ¶ 18: \n\nFailure or inability to live within one’s means, satisfy debts, and meet \nfinancial obligations may indicate poor self-control, lack of judgment, or \nunwillingness to abide by rules and regulations, all of which can raise \nquestions about an individual’s reliability, trustworthiness and ability to \nprotect classified \nfinancially \noverextended is at risk of having to engage in illegal acts to generate \nfunds. \n \nThe guideline notes several conditions that could raise security concerns under \n\ninformation. 
An \n\nindividual who \n\nis \n\n(a) the behavior happened so long ago, was so infrequent, or occurred \nunder such circumstances that it is unlikely to recur and does not cast \ndoubt on the individual’s current reliability, trustworthiness, or good \njudgment; \n\n(b) the conditions that resulted in the financial problem were largely \nbeyond the person’s control (e.g., loss of employment, a business \ndownturn, unexpected medical emergency, or a death, divorce or \nseparation), and the individual acted responsibly under the circumstances; \n\n \n6 \n\nAG ¶ 19. The following are potentially applicable in this case: \n\n \n(a) inability or unwillingness to satisfy debts; and \n\n \n\n(c) a history of not meeting financial obligations. \n\n \n \n \nevidence is sufficient to raise the above disqualifying conditions. \n \n \nprovided under AG ¶ 20. The following are potentially applicable: \n \n\nApplicant had delinquent debts that he was unable or unwilling to pay. The \n\nConditions that could mitigate the financial considerations security concerns are \n\n \n\n \n\n \n\n \n\n(c) the person has received or is receiving counseling for the problem \nand/or there are clear indications that the problem is being resolved or is \nunder control; and \n\n(d) the individual initiated a good-faith effort to repay overdue creditors or \notherwise resolve debts. \n\n \nApplicant owed and still owes a lot of money, but the SOR overstated his debts. \n \nThe evidence available to the Government (credit reports and documents submitted by \nApplicant in response to interrogatories) established that before the SOR was issued, \nApplicant settled the $18,028 debt alleged in SOR ¶ 1.b, and he paid the IRS the \n$33,882 alleged in SOR ¶ 1.g. Additionally, the SOR alleged the high credit on the two \nmortgage loans (SOR ¶¶ 1.d - $351,562 and 1.e - $271,900), when the balances on \nthose loans were reported as zero. \n \n \nApplicant paid his 2008 federal income taxes. 
He has a payment plan in place for \nhis 2012 and 2013 tax years. He settled and paid the second mortgage loan on his \nhome, and he paid other debts that were not alleged in the SOR. His attorney is \nnegotiating settlements for the two judgments against him. Applicant credibly testified \nthat it will take time, but he is committed to resolving all his debts. \n \n \nI find that Applicant established a plan to resolve his financial problems, and he \ntook significant action to implement that plan. He acted responsibly and made a good-\nfaith effort to pay his debts. There are clear indications that his financial problems are \nbeing resolved and are under control. They occurred under circumstances that are \nunlikely to recur and do not cast doubt on his current reliability, trustworthiness, and \ngood judgment. AG ¶¶ 20(c) and 20(d) are applicable. AG ¶ 20(b) is partially applicable. \nAG ¶ 20(a) is not yet completely applicable because Applicant is still in the process of \npaying his debts. \n \nWhole-Person Concept \n \nUnder the whole-person concept, the administrative judge must evaluate an \n \napplicant’s eligibility for a security clearance by considering the totality of the applicant’s \nconduct and all relevant circumstances. The administrative judge should consider the \nnine adjudicative process factors listed at AG ¶ 2(a): \n \n\n \n\n \n\n \n\nthe conduct, \n\nthe nature, extent, and seriousness of \nto \n\nthe conduct; (2) \nthe \n(1) \ncircumstances surrounding \ninclude knowledgeable \nparticipation; (3) the frequency and recency of the conduct; (4) the \nindividual’s age and maturity at the time of the conduct; (5) the extent to \nwhich participation \nthe presence or absence of \nrehabilitation and other permanent behavioral changes; (7) the motivation \nfor the conduct; (8) the potential for pressure, coercion, exploitation, or \nduress; and (9) the likelihood of continuation or recurrence. 
\n\nis voluntary; (6) \n\n \n7 \n\n \nUnder AG ¶ 2(c), the ultimate determination of whether to grant eligibility for a \nsecurity clearance must be an overall commonsense judgment based upon careful \nconsideration of the guidelines and the whole-person concept. \n \n\nI considered the potentially disqualifying and mitigating conditions in light of all \nthe facts and circumstances surrounding this case. I have incorporated my comments \nunder Guideline F in my whole-person analysis. \n\n \nApplicant’s character evidence was extraordinary. His financial problems are \nlarge, but not insurmountable. Applicant credibly testified that he will eventually resolve \nall his debts. His sacrifices in service to this country have earned him the time to do so. \n\n \nOverall, the record evidence leaves me without questions or doubts as to \nApplicant’s eligibility and suitability for a security clearance. I conclude Applicant \nmitigated the financial considerations security concerns. \n \n\nFormal findings for or against Applicant on the allegations set forth in the SOR, \n\nFormal Findings \n\n \n \nas required by section E3.1.25 of Enclosure 3 of the Directive, are: \n \n \n \n \n \n\nSubparagraphs 1.a-1.i: \n \n\nParagraph 1, Guideline F: \n\nFor Applicant \n\nFor Applicant \n\n \n\n \n\n \n \n\n \n\n \n\nConclusion \n\n \n\n \nIn light of all of the circumstances presented by the record in this case, it is \nclearly consistent with the national interest to continue Applicant’s eligibility for a \nsecurity clearance. Eligibility for access to classified information is granted. \n \n \n \n\n________________________ \n\nEdward W. Loughran \nAdministrative Judge \n\n \n\n \n8 \n\n",granted,"LOUGHRAN, Edward W.","Mark S. Zaid, Esq.",Robert J. Kilmartin
3,pdfs/11-04909.h1.pdf.txt,"\n\n DEFENSE OFFICE OF HEARINGS AND APPEALS \n\n DEPARTMENT OF DEFENSE \n\n \n \n\n \n\n \nIn the matter of: \n \n \n \nApplicant for Security Clearance \n\n \n \n\n \n\nISCR Case No. 11-04909 \n\n \n\nFor Government: Richard Stevens, Esq., Department Counsel \n\nFor Applicant: Pro se \n\n \n\n \nDUFFY, James F., Administrative Judge: \n\n \nApplicant mitigated \n\nconsiderations). Clearance is granted. \n\nthe security concerns under Guideline F (financial \n\nStatement of the Case \n\nOn April 5, 2015, the Department of Defense (DOD) Consolidated Adjudications \nFacility (CAF) issued Applicant a Statement of Reasons (SOR) detailing security \nconcerns under Guideline F. DOD CAF took that action under Executive Order 10865, \nSafeguarding Classified Information Within Industry, dated February 20, 1960, as \namended; DOD Directive 5220.6, Defense Industrial Personnel Security Clearance \nReview Program, dated January 2, 1992, as amended (Directive); and the adjudicative \nguidelines (AG) implemented by DOD on September 1, 2006. \n\nThe SOR detailed reasons why DOD adjudicators could not make the affirmative \nfinding under the Directive that it is clearly consistent with the national interest to grant \nApplicant a security clearance. On April 30, 2015, Applicant answered the SOR and \nrequested a hearing. The case was assigned to me on August 26, 2015. The Defense \n\nAppearances \n\n__________ \n\n \nDecision \n__________ \n\n) \n) \n) \n) \n) \n \n \n\n \n \n\n \n\n \n\n \n\n \n1 \n \n \n\nOffice of Hearings and Appeals (DOHA) issued a Notice of Hearing on September 3, \n2015, and the hearing was convened as scheduled on September 22, 2015. \n\n \nAt the hearing, Department Counsel offered Government Exhibits (GE) 1 through \n4. Applicant testified and submitted Applicant Exhibits (AE) A through F. 
The record of \nthe proceedings was left open until October 5, 2015, to provide Applicant the \nopportunity to submit additional documents. He timely submitted documents that were \nmarked as AE G thought I. All exhibits were admitted into evidence without objections. \nDOHA received the hearing transcript (Tr.) on September 30, 2015. \n \n\nProcedural Matters \n\n \n\n \nDepartment Counsel made a motion to withdraw the allegations in SOR ¶¶ 1.d \nand 1.e. Applicant had no objection to the motion. The motion was granted, and those \nallegations were withdrawn.1 \n \n \nDepartment Counsel made a motion to amend SOR ¶ 1.a to reflect that the debt \nwas owed to a state instead of the Internal Revenue Service (IRS). Applicant had no \nobjection to the amendment. The motion to amend was granted.2 \n \n\nFindings of Fact \n\n \n \nApplicant is a 40-year-old mechanic who works as a quality assurance inspector \nfor a defense contractor. He has been working for that contractor in the Middle East \nsince May 2011 and has worked overseas for other defense contractors for a number of \nyears. He earned a general educational development certificate in 1996 and an \nassociate’s degree in 2003. He has been married twice. He married his current wife in \n2005. He has three children, ages 15, 16, and 22.3 \n \n\nExcluding the withdrawn allegations, the SOR alleged that Applicant had three \ndelinquent debts totaling $64,168 (SOR ¶¶ 1.a-1.c). In his Answer to the SOR, he \ndenied each debt. Substantial evidence of the alleged debts is contained in GE 1-4.4 \n\n \nIn 1999, Applicant was involved in a car accident and incurred a broken pelvis \nand hip. He was determined to be fully disabled and collected about $1,000 per month \nin disability payments from the Social Security Administration (SSA). While receiving the \ndisability benefits, he knew there was a limit on how much income he could earn, but \ndid not know the exact restrictions. 
In 2006, he received a letter from SSA that indicated \n \n\n \n\n \n\n \n\n \n\n1 Tr. 23-25. \n\n2 Tr. 43-44. \n\n3 Tr. 5-7, 23, 37-39, 42-43; GE 1. \n\n4 GE 1-4; Applicant’s Answer to the SOR. \n\n \n2 \n \n \n\nhe was earning more income than he was allowed. At that time, his disability payments \nstopped, and he has not received any further disability payments since then.5 \n\n \nA credit report dated June 12, 2010, reflected that Applicant’s SSA account had \nbeen placed for collection in the amount of $20,955 (SOR ¶ 1.c). This delinquent \naccount was the result of his overpayment of disability payments. In 2008, he started \nrepaying this debt through a monthly pay allotment of $300. When he left his job later \nthat year, the allotment stopped. In 2012, the SSA began garnishing his pay at a rate of \n$600 every two weeks. By early 2014 (well before the issuance of the SOR), he fully \nrepaid this debt. In July 2014, he received a letter from the SSA stating he overpaid the \ndebt and would be receiving a refund of $2,535.6 \n\n \nIn 2008 and 2009, Applicant worked overseas, and a portion of his income was \nexempt from taxation. In those years, his income was greater than the exemption, and \nhe incurred taxes that he could not pay. In his security clearance application (SCA), he \ndisclosed that he owed approximately $36,000 to the IRS for 2008 and 2009 (SOR ¶ \n1.b). In 2010, his state also filed a $7,213 tax lien against him (SOR ¶ 1.a).7 \n\n \nIn his Answer to the SOR, Applicant stated that the state tax lien had been repaid \nthrough monthly payroll deductions. He also stated that he had a repayment plan with \nthe IRS and the payments are automatically deducted from his pay.8 \n\n \nAt the hearing, Applicant testified that the state tax lien was repaid, but he had \nnot yet received a document from the state showing it was paid. He stated that the \ndeductions from his pay for the state tax lien stopped over a year ago. 
At the hearing, \nhe also provided IRS account transcripts for 2008 and 2009 that showed he had been \nconsistently making monthly payments of $375 since August 2012. As of September \n2015, his IRS account balance for 2008 was $3,961 and for 2009 was $17,124. The \naccount transcripts showed that his IRS debt had been cut almost in half.9 \n\n \nBesides the debts alleged in the SOR, Applicant’s credit report dated March 26, \n2015, reflected that he had no other delinquent debts. He also testified that he had no \nother delinquent debts. In 2014, he earned about $108,000 and his wife earned about \n$25,000. In his post-hearing submission, Applicant presented letters from the state that \nreflected he did not have an outstanding tax liability for 2008 and 2009.10 \n\n \n\n5 Tr. 26-29; GE 1. \n \n6 Tr. 29-32; GE 1-3; AE C. \n \n7 Tr. 32-36; GE 1. \n\n \n\n \n\n \n\n \n\n8 Applicant’s Answer to the SOR. \n\n9 Tr. 32-36, 46-51; AE D-F. \n\n10 Tr. 39-42; AE A, B, H, I; Applicant’s Answer to the SOR. \n\n \n3 \n \n \n\nPolicies \n\n \n\nThe President of the United States has the authority to control access to \ninformation bearing on national security and to determine whether an individual is \nsufficiently trustworthy to have access to such information. Department of the Navy v. \nEgan, 484 U.S. 518, 527 (1988). The President has authorized the Secretary of \nDefense to grant eligibility for access to classified information “only upon a finding that it \nis clearly consistent with the national interest to do so.” Exec. Or. 10865, Safeguarding \nClassified Information within Industry § 2 (Feb. 20, 1960), as amended. The U.S. \nSupreme Court has recognized the substantial discretion of the Executive Branch in \nregulating access to information pertaining to national security, emphasizing that “no \none has a ‘right’ to a security clearance.” Department of the Navy v. Egan, 484 U.S. \n518, 528 (1988). 
\n\n \nEligibility for a security clearance is predicated upon the applicant meeting the \ncriteria contained in the adjudicative guidelines. These AGs are not inflexible rules of \nlaw. Instead, recognizing the complexities of human behavior, these guidelines are \napplied in conjunction with an evaluation of the whole person. An administrative judge’s \nadjudicative goal is a fair, impartial, and commonsense decision. An administrative \njudge must consider all available, reliable information about the person, past and \npresent, favorable and unfavorable, in reaching a decision. \n\n \nThe Government reposes a high degree of trust and confidence in persons with \naccess to classified information. This relationship transcends normal duty hours and \nendures throughout off-duty hours. Decisions include, by necessity, consideration of the \npossible risk that the applicant may deliberately or inadvertently fail to safeguard \nclassified information. Such decisions entail a certain degree of legally permissible \nextrapolation of potential, rather than actual, risk of compromise of classified \ninformation. Clearance decisions must be “in terms of the national interest and shall in \nno sense be a determination as to the loyalty of the applicant concerned.” See Exec. \nOr. 10865 § 7. See also Executive Order 12968 (Aug. 2, 1995), Section 3. Thus, a \nclearance decision is merely an indication that the Applicant has or has not met the \nstrict guidelines the President and the Secretary of Defense have established for issuing \na clearance. \n\n \nInitially, the Government must establish, by substantial evidence, conditions in \nthe personal or professional history of the applicant that may disqualify the applicant \nfrom being eligible for access to classified information. The Government has the burden \nof establishing controverted facts alleged in the SOR. See Egan, 484 U.S. at 531. 
\n“Substantial evidence” is “more than a scintilla but less than a preponderance.” See v. \nWashington Metro. Area Transit Auth., 36 F.3d 375, 380 (4th Cir. 1994). The guidelines \npresume a nexus or rational connection between proven conduct under any of the \ncriteria listed and an applicant’s security suitability. See ISCR Case No. 95-0611 at 2 \n(App. Bd. May 2, 1996). \n\n \n\n \n4 \n \n \n\nOnce the Government establishes a disqualifying condition by substantial \nevidence, the burden shifts to the applicant to rebut, explain, extenuate, or mitigate the \nfacts. Directive ¶ E3.1.15. An applicant “has the ultimate burden of demonstrating that it \nis clearly consistent with the national interest to grant or continue [his or her] security \nclearance.” ISCR Case No. 01-20700 at 3 (App. Bd. Dec. 19, 2002). The burden of \ndisproving a mitigating condition never shifts to the Government. See ISCR Case No. \n02-31154 at 5 (App. Bd. Sep. 22, 2005). “[S]ecurity clearance determinations should err, \nif they must, on the side of denials.” Egan, 484 U.S. at 531; see AG ¶ 2(b). \n\nAnalysis \n\n \n\n \n\nGuideline F, Financial Considerations \n\n \nThe security concern for this guideline is set out in AG ¶ 18 as follows: \n\n \n\nFailure or inability to live within one’s means, satisfy debts, and meet \nfinancial obligations may indicate poor self-control, lack of judgment, or \nunwillingness to abide by rules and regulations, all of which can raise \nquestions about an individual’s reliability, trustworthiness and ability to \nprotect classified \nfinancially \noverextended is at risk of having to engage in illegal acts to generate \nfunds. \n \nThe guideline notes several conditions that could raise security concerns under \n\ninformation. An \n\nindividual who \n\nis \n\nAG ¶ 19. Two are potentially applicable in this case: \n\n \n(a) inability or unwillingness to satisfy debts; and \n\n(c) a history of not meeting financial obligations. 
\n\n \n \n \n \n \nThe evidence established that Applicant accumulated delinquent debts that he \nwas unable or unwilling to pay for an extended period. AG ¶¶ 19(a) and 19(c) apply in \nthis case. \n \n \napplicable: \n \n\nFour financial considerations mitigating conditions under AG ¶ 20 are potentially \n\n(a) the behavior happened so long ago, was so infrequent, or occurred \nunder such circumstances that it is unlikely to recur and does not cast \ndoubt on the individual’s current reliability, trustworthiness, or good \njudgment; \n \n(b) the conditions that resulted in the financial problem were largely \nbeyond the person’s control (e.g., loss of employment, a business \ndownturn, unexpected medical emergency, or a death, divorce or \nseparation), and the individual acted responsibly under the circumstances; \n\n \n5 \n \n \n\n(c) the person has received or is receiving counseling for the problem \nand/or there are clear indications that the problem is being resolved or is \nunder control; and \n \n(d) the individual initiated a good-faith effort to repay overdue creditors or \notherwise resolve debts. \n \nApplicant incurred delinquent debts because he continued to receive disability \nbenefits after his income exceeded earning limitations and because he failed to pay \ntaxes on income earned in 2008 and 2009. These events do not constitute \ncircumstances beyond his control. AG ¶ 20(b) does not apply. \n\n \nApplicant has resolved the state tax lien in SOR ¶ 1.a and the disability \noverpayment in SOR ¶ 1.c. He has established a repayment plan to resolve the \ndelinquent federal taxes in SOR ¶ 1.b. He has been consistently making payments \nunder that plan for the past three years. The record established that his financial \nproblems are under control and are being resolved. He has not incurred any recent \ndelinquent debts. His financial problems are unlikely to recur and do not cast doubt on \nhis current reliability, trustworthiness, and good judgment. 
AG ¶¶ 20(a), 20(c), and 20(d) \napply. \n\n \n\nWhole-Person Concept \n\nIn the adjudication process, an administrative judge must carefully weigh a \nnumber of variables known as the whole-person concept. Available information about \nthe applicant as well as the factors listed in AG ¶ 2(a) should be considered in reaching \na determination.11 In this case, I gave due consideration to the information about \nApplicant in the record and concluded the favorable information, including the mitigating \nevidence, outweigh the security concerns at issue. Applicant met his burden of \npersuasion to mitigate the security concerns. \n\nFormal findings as required by Section E3.1.25 of Enclosure 3 of the Directive \n\nare: \n\nFormal Findings \n\n \n\n \n\n11 The adjudicative process factors listed at AG ¶ 2(a) are as follows: \n\n(1) the nature, extent, and seriousness of the conduct; (2) the circumstances surrounding the \nconduct, to include knowledgeable participation; (3) the frequency and recency of the \nconduct; (4) the individual’s age and maturity at the time of the conduct; (5) the extent to \nwhich participation is voluntary; (6) the presence or absence of rehabilitation and other \npermanent behavioral changes; (7) the motivation for the conduct; (8) the potential for \npressure, coercion, exploitation, or duress; and (9) the likelihood of continuation or \nrecurrence. \n\n \n\n \n6 \n \n \n\nParagraph 1, Guideline F: \n \nSubparagraphs 1.a-1.c: \nSubparagraphs 1.d-1.e: \n \n\n \n \n\n For Applicant \n\nFor Applicant \nWithdrawn \n\nDecision \n\n \n\n \n \n\n \n\nIn light of all the circumstances presented by the record in this case, it is clearly \nconsistent with the national interest to grant Applicant’s eligibility for a security \nclearance. Clearance is granted. \n\n________________ \n\nJames F. Duffy \n\nAdministrative Judge \n\n \n7 \n \n \n\n",,"DUFFY, James F.",Pro se,Richard Stevens
4,pdfs/11-08313.h1.pdf.txt,"\n\n DEPARTMENT OF DEFENSE \n DEFENSE OFFICE OF HEARINGS AND APPEALS \n\n \n\nISCR Case No. 11-08313 \n\n \nIn the matter of: \n \n \n \nApplicant for Security Clearance \n\n--------------- \n \n\n \n\n \n \n\n) \n) \n) \n) \n) \n \n\n \n \n\nAppearances \n\n______________ \n\n \nDecision \n\n______________ \n\nFor Government: Julie R. Mendez, Esquire, Department Counsel \n\nFor Applicant: Pro se \n\n \n \n\n \n\nMARSHALL, Jr., Arthur E., Administrative Judge: \n\n \n Statement of the Case \n \nOn April 4, 2014, the Department of Defense (DOD) issued Applicant a \nStatement of Reasons (SOR) detailing security concerns under Guideline B (Foreign \nInfluence) and Guideline E (Personal Conduct).1 In a response signed April 28, 2014, \nApplicant admitted all allegations and requested a hearing based on the written record. \nOn August 20, 2015, the Government prepared a file of relevant material (FORM) which \nincluded nine attachments (“Items”). Applicant did not respond to the FORM. I was \nassigned the case on December 1, 2015. Based on a thorough review of the case file, I \nfind that Applicant failed to carry his burden in mitigating security concerns arising under \nboth Guideline B and Guideline E. \n\n Findings of Fact \n\n \nApplicant is a 71-year-old man who is presently not working while his security \nclearance is updated. He has earned a high school diploma. He was divorced in 1998 \n \n1 The action was taken under Executive Order 10865, Safeguarding Classified Information within Industry \n(February 20, 1960), as amended; Department of Defense Directive 5220.6, Defense Industrial Personnel \nSecurity Clearance Review Program (January 2, 1992), as amended (Directive); and the adjudicative \nguidelines (AG) effective within the DOD on September 1, 2006. \n\n1 \n\n \n\n \n \n \n\n \n\nand is now estranged from a subsequent spouse. 
Facts about this estranged female are \nunclear, although it appears she is from Russia, where she currently resides with or \nnear family. He provided incomplete information on his security clearance application \n(SCA) about his family and children. Other discrepancies appear throughout his \ninvestigatory record. Applicant presently lives in another country with a female foreign \nnational in order to reduce the costs he would otherwise expend living in the United \nStates while awaiting a return to work. \n \n \nAt issue in the SOR are the following facts: Applicant has a wife who is a citizen \nand resident of Russia.2 They are estranged, but Applicant has no plans to divorce his \nforeign wife because he no longer believes in divorce. He has a daughter who is a \ncitizen of the United States and is currently a resident of Russia, where she is a \nuniversity student. He also has parents-in-law who are citizens and residents of Russia. \nApplicant met them in about July 1998 on a visit to their home. He again saw them on a \nsubsequent trip to Russia with his wife. He has had no other contact with them and \ndoes not know how often his estranged wife currently has with her parents now that all \nthree are in Russia. His father-in-law and mother-in-law are a teacher and a \ngynecologist, respectively. Applicant assumed they were Communists because the \nCommunists were in power when he met them. As of 2015, Russia is one of the two \nleading state intelligence threats to United States interests, based on their capabilities, \nintent, and broad operational scopes.3 Applicant admits the Guideline B SOR \nallegations, but did not address them in any manner in his SOR Response. \n \n \nIn completing SCAs in both August 3, 2009, and March 23, 2013, Applicant failed \nto list multiple names he has used, places of former residence, periods of \nunemployment, relatives, and foreign contacts. 
He told an investigator in 2010 that he \nhad no problems while working with a particular employer, but later conceded he was \ninvolved in a harassment suit while working there. Despite a 2010 claim to the contrary, \nhe failed in 2007 to file an appropriate contact with foreign nationals report regarding the \nforeign woman with whom he currently cohabitates. Applicant admits these allegations. \n \n\n \n\nPolicies \n\n \n\n \n \n \n\nWhen evaluating an applicant’s suitability \n\n \n \nthe \nadministrative judge must consider the adjudicative guidelines. In addition to brief \nintroductory explanations for each guideline, the adjudicative guidelines list potentially \ndisqualifying conditions and mitigating conditions, which are used in evaluating an \napplicant’s eligibility for access to classified information. \n \n\nfor a security clearance, \n\nThese guidelines are not inflexible rules of law. Instead, recognizing the \ncomplexities of human behavior, these guidelines are applied in conjunction with the \n \n2 In some records within the FORM, Applicant indicated that his estranged wife became a naturalized \ncitizen in 2007 or 2008 and now resides in the United States, but the discrepancies are never resolved. \n(See FORM, Item 6) \n \n3 FORM, Item 9, at 3. \n\n2 \n\nfactors listed in the adjudicative process. The administrative judge’s overarching \nadjudicative goal is a fair, impartial, and commonsense decision. According to AG ¶ \n2(c), the entire process is a conscientious scrutiny of a number of variables known as \nthe “whole-person concept.” The administrative judge must consider all available, \nreliable information about the person in making a decision. \n\n \nThe protection of the national security is the paramount consideration. 
AG ¶ 2(b) \nrequires that “[a]ny doubt concerning personnel being considered for access to \nclassified information will be resolved in favor of national security.” In reaching this \ndecision, I have drawn only those conclusions that are reasonable, logical, and based \non the evidence contained in the record. \n\n \nUnder Directive ¶ E3.1.14, the Government must present evidence to establish \ncontroverted facts alleged in the SOR. Under Directive ¶ E3.1.15, an “applicant is \nresponsible for presenting witnesses and other evidence to rebut, explain, extenuate, or \nmitigate facts admitted by applicant or proven by Department Counsel and has the \nultimate burden of persuasion to obtain a favorable security decision.” \n\nthe Government predicated upon \n\n \nA person who seeks access to classified information enters into a fiduciary \nrelationship with \ntrust and confidence. This \nrelationship transcends normal duty hours and endures throughout off-duty hours. The \nGovernment reposes a high degree of trust and confidence in individuals to whom it \ngrants access to classified information. Decisions include, by necessity, consideration of \nthe possible risk the applicant may deliberately or inadvertently fail to safeguard \nclassified information. Section 7 of Executive Order 10865 provides that decisions shall \nbe “in terms of the national interest and shall in no sense be a determination as to the \nloyalty of the applicant concerned.” See also EO 12968, Section 3.1(b). \n\nThe security concern relating to the Guideline B is set out in AG ¶ 6: \n\nForeign contacts and interests may be a security concern if the individual \nhas divided loyalties or foreign financial interests, may be manipulated or \ninduced to help a foreign person, group, organization, or government in a \nway that is not in U.S. interests, or is vulnerable to pressure or coercion by \nany foreign interest. 
Adjudication under this Guideline can and should \nconsider the identity of the foreign country in which the foreign contact or \nfinancial \nto, such \nconsiderations as whether the foreign country is known to target United \nStates citizens to obtain protected information and/or is associated with a \nrisk of terrorism. \n\nincluding, but not \n\nlocated, \n\ninterest \n\nlimited \n\nis \n\n \n\n \n\n \n \n \n\n3 \n\nAnalysis \n\n \n\n \n\nGuideline B, Foreign Influence \n \n\n \n\nApplicant remains married to the female of Russian origin at issue. Therefore, it \ncan be assumed he maintains some level of affection for this woman, the mother of his \nchild. It can also be assumed he maintains a bond or ties of affection for their child, who \nis now studying at a Russian college. As for his in-laws, while their contact with \nApplicant is negligible, it can be assumed his estranged partner has ties of affection with \nher own parents. Those ties are routinely attributed to the spouse (Applicant) in these \ncases. Given these facts, disqualifying conditions AG ¶¶ 7(a) and (b) apply: \n \n\nto protect sensitive \n\nAG ¶ 7(a) contact with a foreign family member, business or professional \nassociate, friend, or other person who is a citizen of or resident in a \nforeign country if that contact creates a heightened risk of foreign \nexploitation, inducement, manipulation, pressure, or coercion; and \n \nAG ¶ 7(b) connection to a foreign person, group, government, or country \nthat create a potential conflict of interest between the individual’s \nobligation \nthe \nindividual’s desire to help a foreign person, group, or country by providing \nthat information. \n \nIn finding disqualifying conditions applicable, I specifically note that AG ¶ 7(a) \nrequires substantial evidence of a heightened risk. The heightened risk required to raise \na disqualifying condition is a relatively low standard. 
Heightened risk denotes a risk \ngreater than the normal risk inherent in having a family member living under a foreign \ngovernment or substantial assets in a foreign nation. Russia is one of the two leading \nstate intelligence threats to United States interests. In addition, foreign family ties can \npose a security risk even without a connection to a foreign government. This is because \nan applicant may be subject to coercion or undue influence when a third party pressures \nor threatens an applicant’s family members. Under these facts, while unlikely, a third \nparty coercion concern potentially exists in Russia. Therefore, there is sufficient \nevidence to raise the above disqualifying conditions. \n\ninformation or \n\ntechnology and \n\n \nAG ¶ 8 provides conditions that could mitigate security concerns. I considered all \n\nof the mitigating conditions under AG ¶ 8, and find the following are relevant: \n \n\nAG ¶ 8(a) the nature of the relationship with foreign persons, the country in \nwhich these persons are located, or the positions or activities of those \npersons in that country are such that it is unlikely the individual will be \nplaced in a position of having to choose between the interests of a foreign \nindividual, group, organization, or government and the interests of the U.S.; \nand \n\n \n\nAG ¶ 8(b) there is no conflict of interest, either because the individual’s \nsense of loyalty or obligation to the foreign person, group, government, or \ncountry is so minimal, or the individual has such deep and longstanding \nrelationships and loyalties in the U.S., that the individual can be expected to \nresolve any conflict of interests in favor of the U.S. interests. \n\n \n\n \n \n \n\n4 \n\nThe mere possession of close family ties to persons in a foreign country is not, \nas a matter of law, disqualifying under Guideline B. 
However, if only one relative lives in \na foreign country and an applicant has frequent, non-casual contacts with that relative, \nthis factor alone is sufficient to create the potential for foreign influence and could \npotentially result in the compromise of classified information. \n\n \nHere, Applicant provided scant information about his estranged wife, daughter, \nand in-laws. Indeed, information about his in-laws is more abundant than information \nabout his child and her mother. While his personal ties to his in-laws may be weak, their \nimpact on Applicant cannot be discerned without more information about his wife and \nchild. Under these limited facts, no mitigating conditions weigh in Applicant’s favor. \n\nThe security concern for personal conduct is set out in AG ¶ 15, where the \nsignificance of conduct involving questionable judgment, lack of candor, dishonesty, or \nunwillingness to comply with rules and regulations is defined ([p]ersonal conduct can \nraise questions about an individual’s reliability, trustworthiness and ability to protect \nclassified information). Of special interest is any failure to provide truthful and candid \nanswers during the security clearance process or any other failure to cooperate with the \nsecurity clearance process. \n\n \nIn completing his 2009 and 2013 SCAs, Applicant’s answers in many areas were \neither deficient or in conflict. His answers in a 2010 interview were either intentionally or \nnegligently incomplete with regard to past workplace problems and the reporting of a \nforeign cohabitant to his former employer. 
If these inaccuracies were intentionally \nincorrect, such facts could give rise to: \n\n \nAG ¶ 16(a) deliberate omission, concealment, or falsification of relevant \nfacts \nfrom any personnel security questionnaire, personal history \nstatement, or similar form used to conduct investigations, determine \nemployment qualifications, award benefits or status, determine security \nclearance eligibility or trustworthiness, or award fiduciary responsibilities, \nand \n \nAG ¶ 16(b) deliberately providing \ninformation \nconcerning relevant facts to an employer, investigator, security official, \ncompetent medical authority, or other official government representative. \n \nApplicant admitted all three allegations raised under Guideline E and provided no \nexplanations or commentary. While his admission to the SCA discrepancies could be \ndiscounted and his discrepancies found to be the product of negligence, this guideline \ncould be found in his favor. In admitting allegations that he knowingly withheld \ninformation and denied having had problems while employed at a place where he was \ninvolved in a harassment suit, however, the facts tend to indicate that these answers – \nwithout more – were intentionally false or misleading. This is particularly true in the \n5 \n\nfalse or misleading \n\n \n\nGuideline E, Personal Conduct \n \n\n \n\n \n \n \n\nabsence of some explanation claiming forgetfulness or another basis for having \nprovided a negligently entered SCA answer. Therefore, given his admissions, none of \nthe mitigating conditions at AG ¶ 17(a) – (g) apply. \n\nWhole-Person Concept \n \n \nUnder the whole-person concept, the administrative judge must evaluate an \napplicant’s eligibility for a security clearance by considering the totality of the applicant’s \nconduct and all relevant circumstances. The administrative judge should consider the \nnine adjudicative process factors listed at AG ¶ 2(a). 
Some of the factors in AG ¶ 2(a) \nwere addressed under those guidelines, but some warrant additional comment. \n \n \nApplicant failed to do more than admit the allegations raised in the SOR. He then \nrequested a judgment based on the written record. It is that very record that is full of \ndiscrepancies and contradictions that contributed to the security concerns at issue. \nThere are simply insufficient facts to evaluate Applicant’s foreign kin, the SCA and \ninterview inaccuracies, or even the Applicant as an individual. This is simply the result of \na deficiently supplemented record. Without more, there is insufficient information to \nrebut, refute, or mitigate the security concerns raised under the foreign influence and \npersonal conduct guidelines. \n\n \n\n \n\nFOREIGN INFLUENCE \n\n \n\n \n\nAllegations 1.a-1.c: \n\nPERSONAL CONDUCT \n \n \n\nAllegations 2.a-2.c: \n\n \n\nFormal Findings \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\nAGAINST APPLICANT \n\nAgainst Applicant \n\nAGAINST APPLICANT \n\nAgainst Applicant \n\n \n \n \n \n \n\n \n \n \n \n\n \n\n \n\n \n\n \n\n Conclusion \n\n \n\n \nIn light of all of the circumstances presented by the record in this case, it is not \nclearly consistent with the national interest to grant Applicant a security clearance. \nEligibility for access to classified information is denied. \n \n \n \n\n \n\n \n\n \n\n \n\n \n_____________________________ \n\n \n\nArthur E. Marshall, Jr. \nAdministrative Judge \n\n \n\n \n \n \n\n6 \n\n",denied,,Pro se,Julie R. Mendez


# Reading books

When you're doing text work, you're legally obligated to work on Jane Austen's Pride and Prejudice (at least I *think* so). Let's do some naive analysis of it!

## Read in Jane Austen's Pride and Prejudice (without moving the file!)

It's in the `data/` directory, and named `Austen_Pride.txt`.

In [33]:
austen = open("data/Austen_Pride.txt", encoding="utf8").read()
df4 = pd.DataFrame({'austen': [austen]})


## Look at the first 500 or so characters of it 

In [34]:
pd.set_option("display.max_colwidth", 500)
df4

Unnamed: 0,austen
0,"Pride and Prejudice\nby Jane Austen\nChapter 1\nIt is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.\nHowever little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered the rightful property of some one or other of their daughters.\n""My dear Mr. Bennet,"" said his lady to him one day, ""have you heard t..."


## Use a regular expression to find every "he" or "she" in the book. There should be about 3000 of them.

**Tip:** Do you know about **word boundaries?** `\b` means "the beginning or end of a word."

**Tip:** You might also want to use `re.IGNORECASE`. Maybe you'll need to google it? 

**Tip:** Do NOT use `re.compile`

In [35]:
#finding she or he
he_or_she = re.findall(r"\bs?he\b", austen, re.IGNORECASE)
len(he_or_she)

3047
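
To see why the `\b` matters, here's a small sketch on a made-up sentence (the `sample` string is invented for illustration, it's not from the book). Without word boundaries, the pattern also grabs the "he" hiding inside words like "the" and "weather".

```python
import re

# Made-up sentence for illustration only
sample = "She said the weather was fine, and he agreed. Then she left."

# No word boundaries: also matches the "he" inside "the", "weather" and "Then"
without_boundaries = re.findall(r"s?he", sample, re.IGNORECASE)

# With \b on both sides: only the standalone words "he" and "she"
with_boundaries = re.findall(r"\bs?he\b", sample, re.IGNORECASE)

print(without_boundaries)
print(with_boundaries)
```

`re.IGNORECASE` is what lets the lowercase pattern pick up the capitalized "She" at the start of the sentence.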

## Use a regular expression to find those same "he" or "she"s, but also match *the word after it*

The first four should be:

* he is
* he had
* she told
* he came

In [36]:
#finding she or he and the following word
he_she_plus = re.findall(r"\bs?he\b \w+", austen, re.IGNORECASE)
he_she_plus

['he is',
 'he had',
 'she told',
 'he came',
 'he agreed',
 'he is',
 'he married',
 'he may',
 'he comes',
 'she ought',
 'he comes',
 'he chooses',
 'she is',
 'She was',
 'she was',
 'she fancied',
 'He had',
 'he should',
 'she had',
 'he suddenly',
 'She has',
 'She is',
 'she times',
 'she will',
 'she will',
 'he continued',
 'he wished',
 'she began',
 'she had',
 'he spoke',
 'he left',
 'he would',
 'he eluded',
 'He was',
 'he meant',
 'He had',
 'he had',
 'he saw',
 'he wore',
 'She could',
 'he could',
 'she began',
 'he might',
 'he ought',
 'he brought',
 'he had',
 'he was',
 'he was',
 'he was',
 'he was',
 'He was',
 'he would',
 'She is',
 'he looked',
 'he withdrew',
 'She is',
 'She told',
 'she had',
 'she had',
 'he was',
 'he had',
 'He had',
 'he had',
 'she entered',
 'she looked',
 'he actually',
 'she was',
 'he asked',
 'he asked',
 'he did',
 'he seemed',
 'she was',
 'he inquired',
 'she was',
 'he danced',
 'he had',
 'he would',
 'he had',
 'He is',
 

## Use capture groups to save the pronoun (he/she) as one match and the word as another

The first five should look like

```
[('he', 'is'),
 ('he', 'had'),
 ('she', 'told'),
 ('he', 'came'),
 ('he', 'agreed')]
```

In [37]:
#capturing pronouns and following word in separate capture groups
re.findall(r"(\bs?he)\b (\w+)", austen, re.IGNORECASE)


[('he', 'is'),
 ('he', 'had'),
 ('she', 'told'),
 ('he', 'came'),
 ('he', 'agreed'),
 ('he', 'is'),
 ('he', 'married'),
 ('he', 'may'),
 ('he', 'comes'),
 ('she', 'ought'),
 ('he', 'comes'),
 ('he', 'chooses'),
 ('she', 'is'),
 ('She', 'was'),
 ('she', 'was'),
 ('she', 'fancied'),
 ('He', 'had'),
 ('he', 'should'),
 ('she', 'had'),
 ('he', 'suddenly'),
 ('She', 'has'),
 ('She', 'is'),
 ('she', 'times'),
 ('she', 'will'),
 ('she', 'will'),
 ('he', 'continued'),
 ('he', 'wished'),
 ('she', 'began'),
 ('she', 'had'),
 ('he', 'spoke'),
 ('he', 'left'),
 ('he', 'would'),
 ('he', 'eluded'),
 ('He', 'was'),
 ('he', 'meant'),
 ('He', 'had'),
 ('he', 'had'),
 ('he', 'saw'),
 ('he', 'wore'),
 ('She', 'could'),
 ('he', 'could'),
 ('she', 'began'),
 ('he', 'might'),
 ('he', 'ought'),
 ('he', 'brought'),
 ('he', 'had'),
 ('he', 'was'),
 ('he', 'was'),
 ('he', 'was'),
 ('he', 'was'),
 ('He', 'was'),
 ('he', 'would'),
 ('She', 'is'),
 ('he', 'looked'),
 ('he', 'withdrew'),
 ('She', 'is'),
 ('She', 't
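
If you're wondering why the results switched from plain strings to tuples: `re.findall` returns whole matches when the pattern has no capture groups, and one tuple per match when it has two or more. A minimal sketch on an invented sentence (the `text` string is just for illustration):

```python
import re

# Invented sentence for illustration only
text = "He was late, but she had waited."

# No capture groups: each result is the whole matched string
no_groups = re.findall(r"\bs?he\b \w+", text, re.IGNORECASE)

# Two capture groups: each result is a tuple, one item per group
two_groups = re.findall(r"\b(s?he)\b (\w+)", text, re.IGNORECASE)

print(no_groups)
print(two_groups)
```

That list-of-tuples shape is handy because pandas can turn it straight into a dataframe with one column per group.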

## Save those matches into a dataframe

You can give the column names with `columns=['pronoun', 'verb']`

In [38]:
matches2 = re.findall(r"(\bs?he)\b (\w+)", austen, re.IGNORECASE)
df5 = pd.DataFrame(matches2, columns=['pronoun','verb'])

df5

Unnamed: 0,pronoun,verb
0,he,is
1,he,had
2,she,told
3,he,came
4,he,agreed
5,he,is
6,he,married
7,he,may
8,he,comes
9,she,ought


## How many times is each pronoun used?

In [39]:
df5.pronoun.value_counts()


she    1322
he     1054
She     325
He      234
Name: pronoun, dtype: int64

## Oh, wait, clean that up.

Make it only 'he' and 'she' lowercase.

It should be about 1600 'she' and 1300 'he'

In [40]:
df5['pronoun'] = df5['pronoun'].str.lower()

df5.pronoun.value_counts()



she    1647
he     1288
Name: pronoun, dtype: int64

## What are the top 20 most common verbs?

In [41]:
df5.verb.value_counts().head(20)

was          372
had          371
could        172
is           139
would         94
has           70
did           67
will          50
might         46
should        41
felt          38
must          37
said          33
saw           32
thought       31
added         31
then          26
replied       22
continued     21
looked        21
Name: verb, dtype: int64

## What are the top 20 most common verbs for 'he', and the top 20 most common for 'she'

**Tip:** Don't use groupby, just filter. If you want to know how, though, you can also look at "value counts for different categories" on [this page](http://jonathansoma.com/lede/foundations-2017/classes/more-pandas/class-notes/)

In [42]:
#top20 for she
shes = df5[df5['pronoun'] == 'she']
shes.verb.value_counts().head(20)


was          212
had          205
could        132
is            65
would         59
did           38
felt          33
saw           29
will          26
might         25
added         23
has           21
said          20
thought       18
then          15
should        15
found         13
must          13
soon          12
continued     11
Name: verb, dtype: int64

In [43]:
#top20 for he
hes = df5[df5['pronoun'] == 'he']
hes.verb.value_counts().head(20)


had        166
was        160
is          74
has         49
could       40
would       35
did         29
should      26
must        24
will        24
might       21
replied     14
thought     13
may         13
came        13
said        13
does        12
never       12
meant       11
then        11
Name: verb, dtype: int64
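
The tip says to just filter, but if you're curious about the groupby route, here's a rough sketch on a tiny made-up dataframe (`toy` and its contents are invented, and `head(2)` stands in for `head(20)`). It counts the verbs inside each pronoun group, then keeps the top few per group.

```python
import pandas as pd

# Tiny made-up dataframe standing in for df5
toy = pd.DataFrame({
    "pronoun": ["he", "she", "she", "he", "she", "he"],
    "verb":    ["was", "was", "had", "was", "was", "had"],
})

# Count verbs within each pronoun, then keep the top 2 rows per pronoun
top_by_pronoun = (
    toy.groupby("pronoun")["verb"]
       .value_counts()
       .groupby(level="pronoun")
       .head(2)
)

print(top_by_pronoun)
```

The result is one Series indexed by (pronoun, verb), so both top lists come out of a single expression instead of two filtered dataframes.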

## Who cries more, men or women? Give me a percentage answer.

**Tip:** It's `cried`, because of, you know, how books are written

In [44]:
she_cried = shes[shes['verb'] == 'cried']
#she_cried.verb.value_counts()

he_cried = hes[hes['verb'] == 'cried']
#he_cried.verb.value_counts()

#Another way to find out how many women and men are crying:
#she_cried = re.findall(r"(\bshe\b cried)", austen, re.IGNORECASE)
#she_cried
#he_cried = re.findall(r"(\bhe\b cried)", austen, re.IGNORECASE)
#he_cried


perc_she_cries = round((len(she_cried)/(len(she_cried) + len(he_cried))*100))

print("Out of crying people", perc_she_cries, "percent are women.")

Out of crying people 92 percent are women.
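
One caveat about that number: it's the share of "cried" matches that follow 'she', and 'she' shows up more often than 'he' in this book to begin with. Another way to ask the question is "what percent of each pronoun's matches are followed by `cried`?" Here's a sketch of that idea on made-up data (the `toy` dataframe and the `cried_rate` helper are hypothetical, not part of the assignment):

```python
import pandas as pd

# Made-up rows for illustration only
toy = pd.DataFrame({
    "pronoun": ["she", "she", "she", "she", "he", "he"],
    "verb":    ["cried", "was", "cried", "had", "cried", "was"],
})

def cried_rate(df, pronoun):
    """Percent of this pronoun's rows whose following word is 'cried'."""
    rows = df[df["pronoun"] == pronoun]
    # The mean of a True/False column is the fraction of True values
    return (rows["verb"] == "cried").mean() * 100

she_rate = cried_rate(toy, "she")
he_rate = cried_rate(toy, "he")

print(she_rate, he_rate)
```

In this toy data 'she' accounts for two of the three "cried" rows, yet both pronouns are followed by "cried" at the same rate, so the two ways of measuring can tell different stories.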


## How much more common is 'he' than 'she' in J.R.R. Tolkien's Fellowship of the Ring? How does that compare to Pride and Prejudice?

The book is in the same directory.

In [45]:
tolkien = open("data/Lord of the Rings - 01 - The Fellowship of the Ring - J. R. R. Tolkien - 1955.txt", encoding="utf8").read()
df6 = pd.DataFrame({'tolkien': [tolkien]})

In [46]:
matches3 = re.findall(r"(\bs?he)\b", tolkien, re.IGNORECASE)
df7 = pd.DataFrame(matches3, columns=['pronoun'])


In [47]:
df7['pronoun'] = df7['pronoun'].str.lower()

df_heshe_tolkien = round(df7.pronoun.value_counts(normalize = True), 2)*100

df_heshe_austen = round(df5.pronoun.value_counts(normalize = True), 2)*100
df_heshe_austen

she    56.0
he     44.0
Name: pronoun, dtype: float64

In [48]:
print("In Fellowship of the Ring there are", df_heshe_tolkien.he, "percent of 'he' and", df_heshe_tolkien.she, "percent of 'she', whereas in Pride and Prejudice there are", df_heshe_austen.he, "percent of 'he' and", df_heshe_austen.she, "percent of 'she'")

In Fellowship of the Ring there are 95.0 percent of 'he' and 5.0 percent of 'she', whereas in Pride and Prejudice there are 44.0 percent of 'he' and 56.00000000000001 percent of 'she'
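
That `56.00000000000001` is a floating-point artifact: the values were rounded first and multiplied by 100 afterwards, and the multiplication brings the tiny error back. A sketch of one way around it, using a made-up Series with the same 56/44 split (the `pronouns` data is invented for illustration): multiply first, round last.

```python
import pandas as pd

# Made-up data with the same 56/44 split as the Austen result
pronouns = pd.Series(["she"] * 56 + ["he"] * 44)

shares = pronouns.value_counts(normalize=True)

# Rounding before scaling can leave floating-point noise in the result
round_then_scale = round(shares, 2) * 100

# Scaling first and rounding last keeps the printed numbers clean
scale_then_round = (shares * 100).round(1)

print(scale_then_round)
```

Rounding as the very last step is a good habit in general, since any arithmetic done after rounding can reintroduce those stray digits.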
