## Control Cohort Selection
To obtain our control cohort of patients without any diagnosis of ASD, we queried the entire insurance database for all members with at least 12 months of insurance coverage and a valid birth year.

We create a temporary table (#tmpMemberCountNoDate) that includes every insurance member:
- between the ages of 0 and 100 years-old
- with at least 12 months of insurance coverage.

Each row represents a distinct member; the columns are the member ID and the duration of insurance coverage in months.

In [None]:
dbSendUpdate( cn, "SELECT E.MemberId, COUNT ( DISTINCT E.EffectiveDate ) AS nMonthsObservation
INTO #tmpMemberCountNoDate
FROM Enrollment E
INNER JOIN Members M 
ON E.MemberId = M.MemberId
INNER JOIN FactIcd F 
ON M.MemberId = F.MemberId
WHERE
( YEAR ( DateServiceStarted ) - BirthYear ) >= 0
AND ( YEAR ( DateServiceStarted ) - BirthYear ) <= 100
GROUP BY E.MemberId
HAVING  COUNT ( DISTINCT E.EffectiveDate ) >= 12
")

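To make the coverage filter concrete, here is a minimal base R sketch on made-up data. The `enrollment` data frame and its values are hypothetical and not drawn from the database. It mirrors the `COUNT ( DISTINCT E.EffectiveDate ) >= 12` logic, assuming each distinct `EffectiveDate` stands for one month of coverage.

```r
# Hypothetical toy enrollment records: one row per member per coverage month.
enrollment <- data.frame(
  MemberId      = c(rep(1, 12), rep(2, 5), rep(3, 13)),
  EffectiveDate = c(
    sprintf("2015-%02d-01", 1:12),                # member 1: 12 distinct months
    sprintf("2016-%02d-01", c(1, 1, 2, 3, 3)),    # member 2: 3 distinct months (with duplicates)
    sprintf("2017-%02d-01", 1:12), "2018-01-01"   # member 3: 13 distinct months
  ),
  stringsAsFactors = FALSE
)

# Count distinct effective dates per member (COUNT ( DISTINCT ... ) with GROUP BY).
n_months <- tapply(enrollment$EffectiveDate, enrollment$MemberId,
                   function(d) length(unique(d)))

# Keep members with at least 12 distinct months (the HAVING clause).
keep <- n_months >= 12
eligible <- data.frame(
  MemberId           = as.integer(names(n_months)[keep]),
  nMonthsObservation = as.integer(n_months[keep])
)
print(eligible)
```

In this toy data, member 2 is dropped because duplicate effective dates are counted only once.
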
Filter the prior table into a new temporary table (#tmpNonASDMemberCountNoDate)
that excludes all members with at least one diagnosis of ASD (taken from the existing ASDMembersAtLeast1 table).

In [None]:
dbSendUpdate( cn, "SELECT * 
INTO #tmpNonASDMemberCountNoDate
FROM #tmpMemberCountNoDate
WHERE MemberId NOT IN ( 
SELECT MemberId 
FROM ASDMembersAtLeast1 )")
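
The exclusion step is an anti-join. A minimal base R sketch of the same idea follows; both data frames and all IDs are hypothetical stand-ins for #tmpMemberCountNoDate and ASDMembersAtLeast1.

```r
# Hypothetical stand-ins for the two tables (IDs are made up).
all_members <- data.frame(MemberId = c(1, 2, 3, 4),
                          nMonthsObservation = c(12, 24, 36, 18))
asd_members <- data.frame(MemberId = c(2, 4))

# Anti-join: keep rows whose MemberId does not appear in the ASD table,
# the R analogue of WHERE MemberId NOT IN ( SELECT MemberId FROM ... ).
non_asd <- all_members[!(all_members$MemberId %in% asd_members$MemberId), ]
print(non_asd)
```

One caveat on the SQL side: `NOT IN` returns no rows when the subquery yields a NULL, so the query above implicitly assumes that `MemberId` in ASDMembersAtLeast1 is never NULL.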

Create a new temporary table that adds the year in which each service started (kept as a separate table because of size limitations).
Dates of service are restricted to 2008 through 2020 (the database lifespan) to exclude invalid dates.
The column for duration of insurance coverage is no longer needed and is dropped.


In [None]:
dbSendUpdate( cn, "SELECT T.MemberId, YEAR ( F.DateServiceStarted ) AS YearServiceStarted
INTO #tmpNonASDMemberCountwDate
FROM #tmpNonASDMemberCountNoDate T
INNER JOIN FactIcd F
ON F.MemberId = T.MemberId
WHERE YEAR ( F.DateServiceStarted ) BETWEEN 2008 AND 2020
GROUP BY T.MemberId, F.DateServiceStarted
")

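The year restriction can be illustrated with a small base R sketch. The `services` data frame is hypothetical; only the 2008 to 2020 bounds come from the text above.

```r
# Hypothetical service records, including dates outside the database lifespan.
services <- data.frame(
  MemberId           = c(1, 1, 2, 3),
  DateServiceStarted = as.Date(c("2007-12-31", "2010-05-01",
                                 "2021-01-15", "2020-12-31"))
)

# Extract the calendar year, as YEAR ( DateServiceStarted ) does in SQL.
services$YearServiceStarted <- as.integer(format(services$DateServiceStarted, "%Y"))

# Keep only years within 2008 to 2020, inclusive at both ends.
in_range <- services[services$YearServiceStarted >= 2008 &
                     services$YearServiceStarted <= 2020,
                     c("MemberId", "YearServiceStarted")]
print(in_range)
```

Member 2 drops out entirely here because the only service record falls in 2021.
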
Create a temporary control cohort table (#NonASDMemberDiagnoses) in which each row is a distinct combination of member and year of service.
Columns include member ID, gender (sex assigned at birth), birth year, zip code (taken as MIN ( ZipCode ) over the member's enrollment records, standing in for the earliest recorded zip code),
and age (year of service minus birth year).

In [None]:
dbSendUpdate( cn, "SELECT T.MemberId, M.Gender, M.BirthYear, MIN ( E.ZipCode ) AS ZipCode, ( T.YearServiceStarted - BirthYear ) AS Age
INTO #NonASDMemberDiagnoses
FROM #tmpNonASDMemberCountwDate T
INNER JOIN Members M 
ON M.MemberId = T.MemberId
INNER JOIN Enrollment E 
ON E.MemberId = T.MemberId
WHERE ( T.YearServiceStarted - BirthYear ) >= 0
AND ( T.YearServiceStarted - BirthYear ) <= 100
GROUP BY T.MemberId, YearServiceStarted, Gender, BirthYear
")

Create the final control cohort (non-ASD) members table with the following information:
- member ID
- gender (sex assigned at birth)
- birth year
- age at first (non-ASD) diagnosis
- earliest recorded zip code

In [None]:
dbSendUpdate( cn, "SELECT MemberID,Gender,BirthYear,MIN (Age) AS FirstDiagnosedAge,MIN (ZipCode) AS ZipCode
INTO #tmpNonASDMembers
FROM #NonASDMemberDiagnoses
GROUP BY MemberID, Gender, BirthYear
")

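Age at first diagnosis is the minimum, per member, of year of service minus birth year. The base R sketch below shows this on hypothetical rows; the data frame and its values are made up for illustration.

```r
# Hypothetical member-by-year rows (values are made up).
diagnoses <- data.frame(
  MemberId           = c(1, 1, 1, 2, 2),
  BirthYear          = c(2005, 2005, 2005, 1980, 1980),
  YearServiceStarted = c(2012, 2009, 2015, 2010, 2008)
)

# Age at each service year, computed from calendar years only.
diagnoses$Age <- diagnoses$YearServiceStarted - diagnoses$BirthYear

# Minimum age per member (MIN ( Age ) with GROUP BY MemberId, BirthYear).
first_age <- aggregate(Age ~ MemberId + BirthYear, data = diagnoses, FUN = min)
names(first_age)[names(first_age) == "Age"] <- "FirstDiagnosedAge"
first_age <- first_age[order(first_age$MemberId), ]
print(first_age)
```

Because only the birth year is used, an age computed this way can differ from the exact age at the date of service by up to one year.
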
### Control Cohort Demographics
We explored the #tmpNonASDMembers table to describe the demographics of our control cohort of members without a diagnosis of ASD, including:
- the distribution of gender (sex assigned at birth)
- age at first diagnosis by age group (0 to <2, 2 to <5, 5 to <11, 11 to <18, and 18+ years)
- regional distribution across the U.S.


Count the males and females and tabulate age at first diagnosis (0 to <2, 2 to <5, 5 to <11, 11 to <18, and 18 years or older) in the non-ASD control cohort.


In [None]:
dbGetQuery( cn, "SELECT
  COUNT ( CASE WHEN Gender = 'M' THEN 1 END ) AS nMale,
  COUNT ( CASE WHEN Gender = 'F' THEN 1 END ) AS nFemale,
  COUNT ( CASE WHEN FirstDiagnosedAge >= 0 AND FirstDiagnosedAge < 2 THEN 1 END ) AS zero_to_two,
  COUNT ( CASE WHEN FirstDiagnosedAge >= 2 AND FirstDiagnosedAge < 5 THEN 1 END ) AS three_to_five,
  COUNT ( CASE WHEN FirstDiagnosedAge >= 5 AND FirstDiagnosedAge < 11 THEN 1 END ) AS five_to_eleven,
  COUNT ( CASE WHEN FirstDiagnosedAge >= 11 AND FirstDiagnosedAge < 18 THEN 1 END ) AS eleven_to_eighteen,
  COUNT ( CASE WHEN FirstDiagnosedAge >= 18 THEN 1 END ) AS eighteen_plus,
  COUNT ( MemberId ) AS TOTAL
FROM #tmpNonASDMembers")
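
The bin boundaries in the query are half-open: each group includes its lower bound and excludes its upper bound. A minimal base R sketch of the same binning on made-up ages follows; the `first_age` vector and the group labels are illustrative assumptions.

```r
# Hypothetical ages at first diagnosis, chosen to sit on the bin edges.
first_age <- c(0, 1, 2, 4, 5, 10, 11, 17, 18, 45)

# right = FALSE gives intervals [0,2), [2,5), [5,11), [11,18), [18,Inf),
# matching the >= lower AND < upper conditions in the CASE expressions.
age_group <- cut(first_age,
                 breaks = c(0, 2, 5, 11, 18, Inf),
                 right  = FALSE,
                 labels = c("0 to <2", "2 to <5", "5 to <11", "11 to <18", "18+"))

counts <- table(age_group)
print(counts)
```

With these toy ages each group receives exactly two members, which confirms that an age of 2 falls in the second group and an age of 18 in the last.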

To describe the regional distribution of the non-ASD control cohort, we first created a table that maps member ID and zip code to the state associated with that zip code. 

In [None]:
##Create table (RegionalMap_NonASD) mapping member ID and member zip code to associated states

dbSendUpdate( cn, "SELECT DISTINCT NonASD.MemberId, Zips.ZipCode, Zips.State
INTO RegionalMap_NonASD
FROM #tmpNonASDMembers NonASD
INNER JOIN USGeography.dbo.UspsZipCodeRegions Zips
ON NonASD.ZipCode = Zips.ZipCode
")

dbGetQuery( cn, "SELECT COUNT ( MemberId ) 
FROM RegionalMap_NonASD" )
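
The mapping is an inner join on zip code, so members whose zip code has no match in the lookup table are dropped, which is why the row count is checked afterwards. A small base R sketch on hypothetical data shows the effect; both data frames are made-up stand-ins.

```r
# Hypothetical cohort members; member 3 has a zip code missing from the lookup.
members <- data.frame(MemberId = c(1, 2, 3),
                      ZipCode  = c("98101", "10001", "00000"),
                      stringsAsFactors = FALSE)

# Hypothetical zip-to-state lookup standing in for UspsZipCodeRegions.
zip_regions <- data.frame(ZipCode = c("98101", "10001", "73301"),
                          State   = c("WA", "NY", "TX"),
                          stringsAsFactors = FALSE)

# merge() performs an inner join by default, like INNER JOIN ... ON ZipCode.
regional_map <- merge(members, zip_regions, by = "ZipCode")
regional_map <- regional_map[order(regional_map$MemberId),
                             c("MemberId", "ZipCode", "State")]
print(regional_map)
```

Comparing `nrow(regional_map)` with `nrow(members)` shows how many members were lost to unmatched zip codes.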

Then, as with the ASD cohort regional descriptions, separate maps were created for each region of the U.S. (West, Midwest, Southwest, Southeast, Northeast, and non-contiguous U.S. states and territories) depending on the state. These region-specific maps were used in conjunction with member ID to count the non-ASD members with zip codes associated with each region.
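
As a cross-check on the region-by-region queries, the same assignment can be sketched as a single state-to-region lookup in base R. The state lists are the ones used in this section; the `regional_map` rows are hypothetical, and this is an illustrative alternative rather than the method used in the analysis.

```r
# State-to-region lookup built from the state lists used in this section.
region_of <- c(
  setNames(rep("West", 9),
           c("WA", "OR", "ID", "MT", "WY", "CO", "UT", "NV", "CA")),
  setNames(rep("Midwest", 12),
           c("MN", "WI", "MI", "OH", "IN", "IL", "IA", "MO", "KS", "NE", "SD", "ND")),
  setNames(rep("Southwest", 4), c("OK", "TX", "NM", "AZ")),
  setNames(rep("Southeast", 15),
           c("WV", "DE", "MD", "DC", "VA", "NC", "SC", "KY", "TN", "GA",
             "FL", "AL", "MS", "LA", "AR")),
  setNames(rep("Northeast", 9),
           c("NJ", "PA", "NY", "CT", "RI", "MA", "NH", "VT", "ME")),
  setNames(rep("NonContiguous", 4), c("AK", "HI", "PR", "VI"))
)

# Hypothetical member-to-state rows; member 4 appears twice on purpose.
regional_map <- data.frame(MemberId = c(1, 2, 3, 4, 4),
                           State    = c("WA", "TX", "NY", "HI", "HI"),
                           stringsAsFactors = FALSE)

# Look up each row's region, then count distinct members per region
# (the analogue of COUNT ( DISTINCT MemberId ) per region).
regional_map$Region <- unname(region_of[regional_map$State])
counts <- tapply(regional_map$MemberId, regional_map$Region,
                 function(m) length(unique(m)))
print(counts)
```

The lookup covers 53 codes in total: the 50 states plus DC, PR, and VI.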

#### Western States

In [None]:
##Create a temporary table mapping zip codes associated with the Western states

dbSendUpdate( cn, "SELECT DISTINCT ZipCode, State,'West' AS Region
INTO #tmpWestZipCodeMap_NonASD
FROM RegionalMap_NonASD
WHERE State in ( 'WA', 'OR', 'ID', 'MT', 'WY', 'CO', 'UT', 'NV', 'CA' )")

##Count the number of patients in the non-ASD control cohort with Western state zip codes

dbGetQuery( cn, "SELECT COUNT ( DISTINCT MemberId )
FROM RegionalMap_NonASD
WHERE ZipCode IN ( SELECT ZipCode FROM #tmpWestZipCodeMap_NonASD )")

#### Midwestern States

In [None]:
dbSendUpdate( cn, "SELECT DISTINCT ZipCode,State, 'Midwest' AS Region
INTO #tmpMidwestZipCodeMap_NonASD
FROM RegionalMap_NonASD
WHERE State in ( 'MN', 'WI', 'MI', 'OH', 'IN', 'IL', 'IA', 'MO', 'KS', 'NE', 'SD', 'ND' )
")

dbGetQuery( cn, "SELECT COUNT ( DISTINCT MemberId )
FROM RegionalMap_NonASD
WHERE ZipCode IN ( SELECT ZipCode FROM #tmpMidwestZipCodeMap_NonASD )
")

#### Southwestern States

In [None]:
dbSendUpdate( cn, "SELECT DISTINCT ZipCode,State, 'Southwest' AS Region
INTO #tmpSouthwestZipCodeMap_NonASD
FROM RegionalMap_NonASD
WHERE State in ( 'OK', 'TX', 'NM', 'AZ' )
")

dbGetQuery( cn, "SELECT COUNT ( DISTINCT MemberId )
FROM RegionalMap_NonASD
WHERE ZipCode IN ( SELECT ZipCode FROM #tmpSouthwestZipCodeMap_NonASD )
")

#### Southeastern States

In [None]:
dbSendUpdate( cn, "SELECT DISTINCT ZipCode, State, 'Southeast' AS Region
INTO #tmpSoutheastZipCodeMap_NonASD
FROM RegionalMap_NonASD
WHERE State in ( 'WV', 'DE', 'MD', 'DC', 'VA', 'NC', 'SC', 'KY', 'TN', 'GA', 'FL', 'AL', 'MS', 'LA', 'AR' )
")

dbGetQuery( cn, "SELECT COUNT ( DISTINCT MemberId )
FROM RegionalMap_NonASD
WHERE ZipCode IN ( SELECT ZipCode FROM #tmpSoutheastZipCodeMap_NonASD )
")

#### Northeastern States

In [None]:
dbSendUpdate( cn, "SELECT DISTINCT ZipCode,State, 'Northeast' AS Region
INTO #tmpNortheastZipCodeMap_NonASD
FROM RegionalMap_NonASD
WHERE State in ( 'NJ', 'PA', 'NY', 'CT', 'RI', 'MA', 'NH', 'VT', 'ME' )
")

dbGetQuery( cn, "SELECT COUNT ( DISTINCT MemberId )
FROM RegionalMap_NonASD
WHERE ZipCode IN ( SELECT ZipCode FROM #tmpNortheastZipCodeMap_NonASD )
")

#### Non-contiguous U.S. states and territories

In [None]:
dbSendUpdate( cn, "SELECT DISTINCT ZipCode, State, 'NonContiguous' AS Region
INTO #tmpNonContiguousZipCodeMap_NonASD
FROM RegionalMap_NonASD
WHERE State in ( 'AK', 'HI', 'PR', 'VI')
")

dbGetQuery( cn, "SELECT COUNT ( DISTINCT MemberId )
FROM RegionalMap_NonASD
WHERE ZipCode IN ( SELECT ZipCode FROM #tmpNonContiguousZipCodeMap_NonASD )
")