# Step 1: Get the data

We want to analyze cases that determine if the court can pierce the corporate veil of a child or subsidiary corporation to access the assets of a parent corporation.  

We want to focus on torts cases.

**First,** 
I will use WestLaw to create a list of case names.

I will use WestLaw for this step because of their thorough annotation; this will be my proxy for a hypothetical grad student who can provide me with a labeled dataset.

**Second,**
I will use the list of case names to scrape courtlistener.com for the full opinions.

## Part One: WestLaw

### 1. I search West Key Number Headnotes. 

### 2. "Disregarding Corporate Entity --> Piercing Corporate Veil"

30,289 cases

### 3. ... --> "Separate corporates; disregarding separate entities" --> "Parent and subsidiary corporations in general" (#1053)

This produces **1,726 cases**

### 4. Within this group, the labeled "Reasons for Piercing" are:

In general (53)

Debts and obligations of corporation in general (138)

Evasion or violation of law or orders in general (67)

Family and domestic relations. (1)

Environmental liabilities and violations (122)

Labor and employment liabilities and violations (178)

Insolvency, bankruptcy, and receivership (70)

Equipment leases. (4)

Landlord and tenant (38)

Liens, bonds, notes, and mortgages (29)

Sale and transfer of property (14)

Shareholders, officers, and corporation, rights inter se. (17)

Taxation in general (40)

Income and excess profits tax. (23)

**Torts in general (200)**

Fraud (30)

Negligence (52)

Patent, copyright, and trademark ownership and infringement (29)

Wills and decedents' estates (1)
________________

Checking BOTH the "Parent / Subsidiary" and "Torts" headnotes **only accounts for 200 cases.**

This is not enough. 

I broaden the range of headnotes I am selecting. I suspect there are cases I'm missing that include parents and subsidiaries and torts, but do not have these labels.


### 5. To capture those cases which involve torts, but may not be labeled, I run a text query instead of selecting the "Torts" headnote:

"*tort! OR fraud! OR law OR environment! or negligen! OR infring! OR debt OR employ! OR insolven! or bankrupt!*"


(I based this text query on the WestLaw Headnotes categories).
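The root expander (`!`) in the query above matches any ending after the stem, so `tort!` picks up "torts" and "tortious". A rough way to replay the same terms over text held locally, such as headnotes from an export, is a regular expression. This is only a sketch: `term_to_regex` and `matches_query` are hypothetical helper names, and the sketch ignores any automatic plural or equivalency handling that Westlaw may apply to plain terms.

```python
import re

# Terms copied from the query above; "!" marks a root expander.
TERMS = ["tort!", "fraud!", "law", "environment!", "negligen!", "infring!",
         "debt", "employ!", "insolven!", "bankrupt!"]

def term_to_regex(term):
    # "tort!" becomes a stem followed by any word characters;
    # a plain term such as "law" must match as a whole word.
    if term.endswith("!"):
        return r"\b" + re.escape(term[:-1]) + r"\w*"
    return r"\b" + re.escape(term) + r"\b"

# One case-insensitive pattern that ORs all of the terms together.
QUERY_RE = re.compile("|".join(term_to_regex(t) for t in TERMS), re.IGNORECASE)

def matches_query(text):
    """Return True if any query term appears in the text."""
    return QUERY_RE.search(text) is not None
```

For example, `matches_query("tortious interference")` is true, while `matches_query("breach of contract")` is false.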

________
This returns **914 results.** 

I want at least 1,000 results. 

### 6. So I expand the Headnotes to include:

i. Related corporations in general (154)

ii. Parent and subsidiary corporations in general (1,726)

iii. Single business enterprise (342)

iv. Common enterprise (99)

v. Identity of directors, officers, or shareholders (543)


I review a handful of these cases, and determine that -- while a majority involve piercing one corporation to get the assets of another corporation -- the specific relationship between the corporations is called a variety of things beyond "parent," "child," or "subsidiary."

### 7. Then I run the same "torts" (or "non-contracts") text query, and get:


 **1,493 cases.**
 
 _______________
 
 ### 8. What if cases involve parents and subsidiaries, but are not labeled?
 
I want to be thorough, so I run the same search from the opposite side, beginning with the "*Reasons for Piercing --> Torts*" headnote this time:
 
### 9. I use the Headnotes to select torts cases

Torts in general (527)

This is already too small. 

### 10. I include all "Reasons for Piercing" Headnotes that are NOT "contracts in general"

This includes:

In general (53)

Debts and obligations of corporation in general (138)

Evasion or violation of law or orders in general (67)

Family and domestic relations. (1)

Environmental liabilities and violations (122)

Labor and employment liabilities and violations (178)

Insolvency, bankruptcy, and receivership (70)

Equipment leases. (4)

Landlord and tenant (38)

Liens, bonds, notes, and mortgages (29)

Sale and transfer of property (14)

Shareholders, officers, and corporation, rights inter se. (17)

Taxation in general (40)

Income and excess profits tax. (23)

Torts in general (200)

Fraud (30)

Negligence (52)

Patent, copyright, and trademark ownership and infringement (29)

Wills and decedents' estates (1)

____________

Remedies and Procedure (599)

In general (24)

Jurisdiction and venue in general (99)

Jurisdiction over shareholders, directors, or officers of foreign corporations (383)

Parties and standing (14)

Process (50)

Arbitration (29)

_______________

This adds up to *6,534 cases*.

### 11. Then I run a text query for parents and subsidiaries

"*adv: parent OR subsidiar!*"

This returns **1,705 cases**

Combined with the other cases found above, this amounts to a log of:


**3,208 possible cases**

## Part Two - consolidate the cases

In [1]:
import pandas as pd



In [2]:
df1 = pd.read_csv("data/pg1_Westlaw PrecisionListof1000headnotesforIIDISREGARDINGCORPORATEENTITYAnd8201PIERCINGC.csv")

In [3]:
df2 = pd.read_csv("data/pg_2Westlaw PrecisionListof711headnotesforIIDISREGARDINGCORPORATEENTITYAnd8201PIERCINGCO.csv")

In [4]:
df3 = pd.read_csv("data/WestlawPrecisionListof579headnotesforIIDISREGARDINGCORPORATEENTITYAnd8201PIERCINGCO.csv")

In [5]:
df4 = pd.read_csv("data/WestlawPrecisionListof914headnotesfork1053And8212Parentandsubsidiarycorporationsing.csv")

In [6]:
len(df4), len(df3), len(df2), len(df1), len(df4)+len(df3)+len(df2)+len(df1)

(915, 580, 712, 1001, 3208)
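The four `read_csv` calls above could also be written as a loop over the export folder. The following is a minimal sketch, not the method used here: `load_exports` and the `source_file` column are hypothetical names, and it assumes every `.csv` in the folder is a Westlaw export with the same columns.

```python
from pathlib import Path

import pandas as pd

def load_exports(folder):
    """Read every .csv in `folder` and stack the rows into one DataFrame."""
    frames = []
    for path in sorted(Path(folder).glob("*.csv")):
        frame = pd.read_csv(path)
        # Hypothetical helper column: remember which export each row came from.
        frame["source_file"] = path.name
        frames.append(frame)
    # ignore_index=True renumbers the rows 0..n-1 across all files.
    return pd.concat(frames, ignore_index=True)
```

Keeping the file name on each row makes it possible to trace a headnote back to the search that produced it after the files are combined.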

_______________

Now merge all of the datasets

In [52]:
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the same rows
all_data = pd.concat([df4, df3, df2, df1])

In [53]:
len(all_data)

3208

Some of this data is empty (see 'NaN', "not a number," i.e., null).

In [64]:
all_data.iloc[0:1]

Unnamed: 0,Citation,Court,Date,Document Preview,Document URL,Headnote,Key Number,KeyCite Treatment,KeyCite URL,Title,Type,Unnamed: 11
0,--- F.Supp.3d ----,"United States District Court, E.D. Pennsylvania.","April 24, 2024",LITIGATION Jurisdiction. Customer of trash ha...,https://www.westlaw.com/Document/Ie1dbd6700307...,Customer of trash and recycling hauler failed ...,101 CORPORATIONS AND BUSINESS ORGANIZATIONS > ...,,,"Salyers v. A.J. Blosenski, Inc.",Custom Digest,


Let's replace those 'NaN's with blank strings ('') so we can see which columns we can get rid of. 

In [75]:
step_1 = all_data.dropna(axis=1, how='all')
noNa_data = step_1.fillna('')
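To see what these two calls do, here is a small sketch on a toy frame. The two rows are invented for illustration; only the column names come from the Westlaw export.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Title": ["A v. B", "C v. D"],
    "KeyCite Treatment": ["Yellow KeyCite", np.nan],  # partly empty: kept
    "Unnamed: 11": [np.nan, np.nan],                   # entirely empty: dropped
})

# how='all' drops a column only when every value in it is missing;
# fillna('') then blanks out the missing values that remain.
cleaned = toy.dropna(axis=1, how="all").fillna("")

print(list(cleaned.columns))                  # ['Title', 'KeyCite Treatment']
print(cleaned["KeyCite Treatment"].tolist())  # ['Yellow KeyCite', '']
```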

In [77]:
noNa_data.head()

Unnamed: 0,Citation,Court,Date,Document Preview,Document URL,Headnote,Key Number,KeyCite Treatment,KeyCite URL,Title,Type
0,--- F.Supp.3d ----,"United States District Court, E.D. Pennsylvania.","April 24, 2024",LITIGATION Jurisdiction. Customer of trash ha...,https://www.westlaw.com/Document/Ie1dbd6700307...,Customer of trash and recycling hauler failed ...,101 CORPORATIONS AND BUSINESS ORGANIZATIONS > ...,,,"Salyers v. A.J. Blosenski, Inc.",Custom Digest
1,690 S.W.3d 396,"Court of Appeals of Texas, Amarillo.","April 15, 2024",BUSINESS ORGANIZATIONS Limited Liability Comp...,https://www.westlaw.com/Document/I88268850fbf5...,A subsidiary corporation and its parent corpor...,101 CORPORATIONS AND BUSINESS ORGANIZATIONS > ...,,,Ovation Finance Holdings 5 LLC v. G.E.T. Marke...,Custom Digest
2,--- F.Supp.3d ----,"United States District Court, D. Rhode Island.","March 14, 2024",PRODUCTS LIABILITY Preemption. Patient's prod...,https://www.westlaw.com/Document/I506a1730e25d...,"Under Rhode Island law, in order to pierce the...",101 CORPORATIONS AND BUSINESS ORGANIZATIONS > ...,,,"Franks v. Coopersurgical, Inc.",Custom Digest
3,--- F.Supp.3d ----,"United States District Court, D. Rhode Island.","March 14, 2024",PRODUCTS LIABILITY Preemption. Patient's prod...,https://www.westlaw.com/Document/I506a1730e25d...,"Under Rhode Island law, in order to pierce the...",101 CORPORATIONS AND BUSINESS ORGANIZATIONS > ...,,,"Franks v. Coopersurgical, Inc.",Custom Digest
4,--- F.Supp.3d ----,"United States District Court, S.D. New York.","February 1, 2024",LABOR AND EMPLOYMENT Discrimination. Former e...,https://www.westlaw.com/Document/Ic7275680c1a6...,"In some circumstances, courts may pierce the c...",101 CORPORATIONS AND BUSINESS ORGANIZATIONS > ...,,,"Doheny v. International Business Machines, Corp.",Custom Digest


Some of these cases look identical (see rows 2 & 3 above). 

Let's consolidate the dataframe to remove duplicate cases. 

We want to be sure we only remove TRUE duplicates, without getting rid of different opinions written in the same matter, so let's group by:
1. case name ["Title"]
2. Date
3. Citation
4. Court

This will take those rows with identical: 1. parties, 2. dates of decision, 3. citations, and 4. courts, and lump them together into the same row. 
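Before grouping, it can help to count how many rows repeat on those four columns. A minimal sketch using `DataFrame.duplicated`, with three rows copied from the preview above (the name `KEY_COLS` is an assumption introduced here):

```python
import pandas as pd

KEY_COLS = ["Title", "Court", "Date", "Citation"]

toy = pd.DataFrame({
    "Title": ["Franks v. Coopersurgical, Inc.",
              "Franks v. Coopersurgical, Inc.",
              "Doheny v. International Business Machines, Corp."],
    "Court": ["United States District Court, D. Rhode Island.",
              "United States District Court, D. Rhode Island.",
              "United States District Court, S.D. New York."],
    "Date": ["March 14, 2024", "March 14, 2024", "February 1, 2024"],
    "Citation": ["--- F.Supp.3d ----"] * 3,
})

# True for every row whose four key values already appeared in an earlier row.
repeat_mask = toy.duplicated(subset=KEY_COLS, keep="first")
n_repeats = int(repeat_mask.sum())
n_unique = int((~repeat_mask).sum())
```

Here `n_repeats` is 1 and `n_unique` is 2: the second Franks row repeats the first, while the Doheny row stays separate even though it shares the placeholder citation.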

**What should we do with the values in the other columns?**

I'm going to save the values in a list. 

(We can perform a check to see if all the values in the list are identical later.)

In [79]:
grouped_data = noNa_data.groupby(['Title', 'Court', 'Date', 'Citation'], as_index=False).agg(list)

In [82]:
grouped_data.head()

Unnamed: 0,Title,Court,Date,Citation,Document Preview,Document URL,Headnote,Key Number,KeyCite Treatment,KeyCite URL,Type
0,,,,,"[, , , , , , , , , ]","[, , , , , , , , , ]","[, , , , , , , , , ]","[, , , , , , , , , ]","[, , , , , , , , , ]","[, , , , , , , , , ]",[Copyright 2024 Thomson Reuters/West. No Claim...
1,"17315 Collins Ave., LLC v. Fortune Development...","District Court of Appeal of Florida, Third Dis...","May 5, 2010",34 So.3d 166,[BUSINESS ORGANIZATIONS - Piercing Corporate V...,[https://www.westlaw.com/Document/I91d1921e582...,[Parent company and subsidiary operated as alt...,[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,"[Yellow KeyCite, Yellow KeyCite]",[https://www.westlaw.com/Link/RelatedInformati...,"[Custom Digest, Custom Digest]"
2,"21st Century Financial Services, LLC v. Manche...","United States District Court, S.D. California.","June 8, 2017",255 F.Supp.3d 1012,[BUSINESS ORGANIZATIONS Partnerships. General...,[https://www.westlaw.com/Document/I5b0882d04cd...,[Common principles apply in determining alter ...,[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,[Yellow KeyCite],[https://www.westlaw.com/Link/RelatedInformati...,[Custom Digest]
3,"3-D Elec. Co., Inc. v. Barnett Const. Co.","Court of Appeals of Texas, Dallas.","January 30, 1986",706 S.W.2d 135,[Texas electrical contractor brought breach of...,[https://www.westlaw.com/Document/I30b362b8e79...,[Court will not disregard separate legal entit...,[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,"[Yellow KeyCite, Yellow KeyCite, Yellow KeyCit...",[https://www.westlaw.com/Link/RelatedInformati...,"[Custom Digest, Custom Digest, Custom Digest, ..."
4,"A & I Realty Corp. v. Kent Dry Cleaners, Inc.","District Court, Nassau County, New York, Secon...","December 19, 1969",61 Misc.2d 887,[Action by landlord against corporate tenant f...,[https://www.westlaw.com/Document/I926e6ac5d8c...,[Where corporation is so dominated and the int...,[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,"[, ]","[, ]","[Custom Digest, Custom Digest]"
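The check mentioned earlier, whether all the values collected in a list are identical, could look like the sketch below. The toy rows are invented and `all_identical` is a hypothetical helper; the grouping call mirrors the one used above.

```python
import pandas as pd

toy = pd.DataFrame({
    "Title": ["A v. B", "A v. B", "C v. D"],
    "Court": ["Ct. 1", "Ct. 1", "Ct. 2"],
    "Date": ["May 5, 2010", "May 5, 2010", "June 8, 2017"],
    "Citation": ["1 X 1", "1 X 1", "2 X 2"],
    "Headnote": ["first headnote", "second headnote", "only headnote"],
    "Type": ["Custom Digest", "Custom Digest", "Custom Digest"],
})

# Collapse rows that share the four key columns; collect the rest into lists.
grouped = toy.groupby(["Title", "Court", "Date", "Citation"], as_index=False).agg(list)

def all_identical(values):
    """True when a list holds at most one distinct value."""
    return len(set(values)) <= 1

# For each list-valued column: does every case hold a single repeated value?
uniform = {col: bool(grouped[col].apply(all_identical).all())
           for col in ["Headnote", "Type"]}
```

In this toy data `uniform` comes out as `{"Headnote": False, "Type": True}`, which suggests a column like `Type` could be collapsed back to a single value while `Headnote` should stay a list.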


Let's see how many *unique* cases we have. 

(after getting rid of that first row, which is an artifact from one of the original .csv files)

In [85]:
grouped_data_1 = grouped_data.drop(axis=0, index=0)
len(grouped_data_1)

1764
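Dropping index 0 works here because the artifact group has empty key values and sorts first. A slightly more defensive sketch filters on the empty `Title` instead, so it does not depend on row order. The toy frame below is invented for illustration.

```python
import pandas as pd

toy_grouped = pd.DataFrame({
    "Title": ["", "A v. B", "C v. D"],
    "Citation": ["", "1 X 1", "2 X 2"],
    "Type": [["Copyright notice"], ["Custom Digest"], ["Custom Digest"]],
})

# Keep only rows that have a case name; the copyright footer has none.
real_cases = toy_grouped[toy_grouped["Title"].str.strip() != ""]
n_cases = len(real_cases)
```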

#### Our combined dataset of 3,208 rows collapses to only 1,764 unique cases.

## Next steps: Analyze the case metadata

Date range?

Courts where cases were heard?

Surviving headnotes?
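The first two questions could be approached as in the sketch below. It assumes the `Date` strings all follow the "Month D, YYYY" pattern seen in the previews; the three toy rows reuse courts and dates from the output above, paired arbitrarily.

```python
import pandas as pd

toy = pd.DataFrame({
    "Court": ["Court of Appeals of Texas, Dallas.",
              "United States District Court, S.D. New York.",
              "Court of Appeals of Texas, Dallas."],
    "Date": ["January 30, 1986", "February 1, 2024", "March 11, 2015"],
})

# Parse the date strings; anything that does not fit the pattern becomes NaT.
decided = pd.to_datetime(toy["Date"], format="%B %d, %Y", errors="coerce")
earliest, latest = decided.min(), decided.max()

# Count how many cases each court decided.
court_counts = toy["Court"].value_counts()
```

With `errors="coerce"`, a quick `decided.isna().sum()` shows how many dates failed to parse before the range is trusted.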

## Next next steps: scrape the opinions

Use the case names to scrape their opinions from courtlistener.com
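One possible starting point is to turn each case name into a search URL. The sketch below only builds the URL and makes no request. The endpoint path and the `q` and `type` parameters are assumptions about CourtListener's REST search API and should be checked against its current documentation; `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumption: CourtListener's REST search endpoint; verify before use.
SEARCH_ENDPOINT = "https://www.courtlistener.com/api/rest/v4/search/"

def build_search_url(title, citation=""):
    """Build a search URL for one case from its Westlaw title and citation."""
    terms = ['"' + title + '"']
    # Placeholder citations such as "--- F.Supp.3d ----" carry no usable
    # volume or page, so they are left out of the query.
    if citation and not citation.startswith("---"):
        terms.append('"' + citation + '"')
    # Assumption: type "o" restricts the search to opinions.
    params = {"q": " ".join(terms), "type": "o"}
    return SEARCH_ENDPOINT + "?" + urlencode(params)
```

For example, `build_search_url("3-D Elec. Co., Inc. v. Barnett Const. Co.", "706 S.W.2d 135")` yields a URL whose `q` parameter holds both the quoted title and the quoted citation.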

In [86]:
grouped_data_1.head()

Unnamed: 0,Title,Court,Date,Citation,Document Preview,Document URL,Headnote,Key Number,KeyCite Treatment,KeyCite URL,Type
1,"17315 Collins Ave., LLC v. Fortune Development...","District Court of Appeal of Florida, Third Dis...","May 5, 2010",34 So.3d 166,[BUSINESS ORGANIZATIONS - Piercing Corporate V...,[https://www.westlaw.com/Document/I91d1921e582...,[Parent company and subsidiary operated as alt...,[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,"[Yellow KeyCite, Yellow KeyCite]",[https://www.westlaw.com/Link/RelatedInformati...,"[Custom Digest, Custom Digest]"
2,"21st Century Financial Services, LLC v. Manche...","United States District Court, S.D. California.","June 8, 2017",255 F.Supp.3d 1012,[BUSINESS ORGANIZATIONS Partnerships. General...,[https://www.westlaw.com/Document/I5b0882d04cd...,[Common principles apply in determining alter ...,[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,[Yellow KeyCite],[https://www.westlaw.com/Link/RelatedInformati...,[Custom Digest]
3,"3-D Elec. Co., Inc. v. Barnett Const. Co.","Court of Appeals of Texas, Dallas.","January 30, 1986",706 S.W.2d 135,[Texas electrical contractor brought breach of...,[https://www.westlaw.com/Document/I30b362b8e79...,[Court will not disregard separate legal entit...,[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,"[Yellow KeyCite, Yellow KeyCite, Yellow KeyCit...",[https://www.westlaw.com/Link/RelatedInformati...,"[Custom Digest, Custom Digest, Custom Digest, ..."
4,"A & I Realty Corp. v. Kent Dry Cleaners, Inc.","District Court, Nassau County, New York, Secon...","December 19, 1969",61 Misc.2d 887,[Action by landlord against corporate tenant f...,[https://www.westlaw.com/Document/I926e6ac5d8c...,[Where corporation is so dominated and the int...,[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,"[, ]","[, ]","[Custom Digest, Custom Digest]"
5,"A.G. Cullen Const., Inc. v. Burnham Partners, LLC","Appellate Court of Illinois, First District, T...","March 11, 2015",2015 IL App (1st) 122538,[LITIGATION - Fraudulent Conveyances. LLC and ...,[https://www.westlaw.com/Document/Ida6382dac99...,"[Under Delaware law, generally, the corporate ...",[101 CORPORATIONS AND BUSINESS ORGANIZATIONS >...,[],[],[Custom Digest]


In [95]:
grouped_data_1['Type'].apply(lambda x: set(x))

1       {Custom Digest}
2       {Custom Digest}
3       {Custom Digest}
4       {Custom Digest}
5       {Custom Digest}
6       {Custom Digest}
7       {Custom Digest}
8       {Custom Digest}
9       {Custom Digest}
10      {Custom Digest}
11      {Custom Digest}
12      {Custom Digest}
13      {Custom Digest}
14      {Custom Digest}
15      {Custom Digest}
16      {Custom Digest}
17      {Custom Digest}
18      {Custom Digest}
19      {Custom Digest}
20      {Custom Digest}
21      {Custom Digest}
22      {Custom Digest}
23      {Custom Digest}
24      {Custom Digest}
25      {Custom Digest}
26      {Custom Digest}
27      {Custom Digest}
28      {Custom Digest}
29      {Custom Digest}
30      {Custom Digest}
             ...       
1735    {Custom Digest}
1736    {Custom Digest}
1737    {Custom Digest}
1738    {Custom Digest}
1739    {Custom Digest}
1740    {Custom Digest}
1741    {Custom Digest}
1742    {Custom Digest}
1743    {Custom Digest}
1744    {Custom Digest}
1745    {Custom 

1    {Custom Digest}
2    {Custom Digest}
3    {Custom Digest}
4    {Custom Digest}
5    {Custom Digest}
Name: Type, dtype: object