Update data.md
kpb33132 committed Dec 16, 2023
1 parent 500f99e commit 3c6186e
Showing 1 changed file with 31 additions and 82 deletions.
113 changes: 31 additions & 82 deletions content/homepage/data.md
@@ -4,99 +4,48 @@ weight: 25
header_menu: true
---

**Download the data here**
Download the data here

The data I use is available for you to sort and slice however you like - Download it as an excel sheet or a .csv file. We also have some presorted options for you down in the visualizations area, under the "Raw Data and Notes" tab.
The data I use is available for you to sort and slice however you like - but the easy way to do it isn't working yet as we recover from our google issues. So for now look under the Raw Data and Notes tab in the visualizations area... once these buttons are working again I'll remove this sentence. (11/25/2023) Download it as an excel sheet or a .csv file. We also have some presorted options for you down in the visualizations area, under the "Raw Data and Notes" tab.

If you have a case we missed, or found any errors, [please use the contact form at the bottom of this page](#contact)
If you have a case we missed, or found any errors, please use the contact form at the bottom of this page

{{< extlink-button href="https://docs.google.com/spreadsheets/d/1t6I-j30Nf7pTwl2i1snMbFWcTbWkYMtnk192JL1Og9k/edit#gid=1882457294" text="Data Source" >}} {{< extlink-button href="https://docs.google.com/spreadsheets/d/1t6I-j30Nf7pTwl2i1snMbFWcTbWkYMtnk192JL1Og9k/edit#gid=1882457294&single=true&output=csv" text="As CSV" >}}

{{< extlink-button href="https://docs.google.com/spreadsheets/d/1t6I-j30Nf7pTwl2i1snMbFWcTbWkYMtnk192JL1Og9k/edit#gid=1882457294" text="Data Source" >}}
{{< extlink-button href="https://docs.google.com/spreadsheets/d/1t6I-j30Nf7pTwl2i1snMbFWcTbWkYMtnk192JL1Og9k/edit#gid=1882457294&single=true&output=csv" text="As CSV" >}}

##### A note about who's included in the "religious affiliated" category.
A note about who's included in the "religious affiliated" category.
It is not just full time employees or ordained staff. The Bureau of Labor Statistics says there are only approximately 60,000 paid pastors in the United States. But with unpaid pastors religious organizations say the real number is closer to 600,000. That is just under two tenths of one percent of the American population (.0018 of population, to be precise).

We include in our "religious affiliated" total people who are named in the media reports we catalogue as pastors, youth pastors, priests, brothers, nuns, missionaries, bishops, deacons, church officials, Sunday school teachers, teachers in religious schools, etc. We do not include people who are just listed as members of a church.

We include in our "religious affiliated" total people who are named in the media reports we catalogue as pastors, youth pastors, priests, brothers, nuns, missionaries, bishops, deacons, church officials, Sunday school teachers, teachers in religious schools, etc. We do not include people who are just listed as members of a church.
For those who want per capita comparisons:
The transgender population of the United States has been estimated at anywhere from one half of one percent to two percent of the population, with some estimates as high as five percent of GenZ. Even though that is likely a substantial undercount of the transgender population (because many gender non conforming people live in the woodwork) we use the lower estimate, which would put the transgender population at 1,650,000, more than 2 1/2 times the size of the pastor population.

Where does the data come from?
I start the weekly research with the excellent {{}}. JoeMyGod tracks cases involving religious figures and politicians really well, and provides solid links to the stories about incidents and arrests. I add to those links those that I find through keyword searches using various search engines, and, recently, using AI. I also run a check against Reddit's {{}}.

@darabos

darabos Jan 4, 2024

I think some links accidentally went missing in this change. In the current version there are three {{}} markers here that appear simply as "{{}}" on the site.


##### For those who want per capita comparisons:
It's my intention to make sure we have at least a year of full data on which to base a report. But there was definitely a learning curve in gathering the data, and my view is that during the first seven weeks or so data collection was inconsistent. While I'm trying to fill that data in using the techniques that have been developed, I expect to throw out those first seven weeks before putting together the report. That means the series will run at least through week 60, assuming nothing happens to prevent me from completing that task.

The transgender population of the United States has been estimated at anywhere from one half of one percent to two percent of the population, with some estimates as high as five percent of GenZ. Even though that is likely a substantial undercount of the transgender population (because many gender non conforming people live in the woodwork) we use the lower estimate, which would put the transgender population at 1,650,000, more than 2 1/2 times the size of the pastor population.
It's important to note, as you review the data, that the media story collection model has some important limitations.

##### Where does the data come from?
I start the weekly research with keyword searches using various search engines, and, recently, have experimented (not entirely successfully) with using AI. I also run a check against the excellent {{<extlink
text="JoeMyGod blog" href="https://www.joemygod.com/">}}. JoeMyGod
tracks cases involving religious figures and politicians really well,
and provides solid links to the stories about incidents and
arrests. I add to those links those that I find through Reddit's {{<extlink
text="Not a Drag Queen subredit" href="https://www.reddit.com/r/NotADragQueen//">}}, as well as a couple of other sources, including {{<extlink
text="Floodlit" href="https://www.floodlit.org//">}}
, which tracks Mormon Church offenders.
Web crawlers/spiders don't get to every website every day. Thus reports can get delayed, throwing off the week in which an incident appears in the TikTok reports - some incidents may get added into totals for the week or so following occurrence.

It's my intention to make sure we have at least a year of full data on which to base a report. But there was definitely a learning curve in gathering the data, and my view is that during the first seven weeks or so data collection was inconsistent. While I'm trying to fill that data in using the techniques that have been developed, I expect to throw out those first seven weeks before putting together the report. That means the
series will run at least through week 60, assuming nothing happens to
prevent me from completing that task.
We can't be certain that the spiders get everywhere, so some reports could, theoretically, be missed.

It's important to note, as you review the data, that the media story collection model has some important limitations.
Not all arrests or convictions result in news coverage, and there is no central registry of cases. Those non-media reported cases will, therefore, not be reflected in the data.

It's assumed that many crimes go unreported. There's no data on whether those unreported crimes skew in any particular direction. Some commenters have assumed that churches cover up many incidents, but these reports make no assumptions about that.

One thing that in the past has thrown gross numbers off slightly is that a single incident will result in stories at various milestones. For example, a Catholic Cardinal generated stories when arrested, when going to trial and when found to be not competent to stand trial. I try to filter for that by either deleting duplicates (if you download data, look for the "deleted" column!) or updating existing entries, which creates a problem with the date on which an entry shows up, but otherwise should keep the gross numbers of offenders correct.

Why are the numbers in the data set different from the TikTok you just watched? Simple: I update as I receive and categorize reports. The TikToks are one week slices, and the data has likely been updated since the one you saw was recorded. And there's the effect from the corrections above - if there are multiple stories about a single perpetrator I'll filter that out of the TikTok totals, but the website may count the individual twice (we're working on a fix for that - a duplicate name filter - but it's not implemented yet.)

I avoid cases that are just "materials" cases, involving pictures/videos (those would, unfortunately, be far too many to catalogue) - and I also do not include "sting" cases, where the perpetrator is connecting with an undercover law enforcement officer and never actually is in contact with a child. The cases included here relate solely to actual assaults on children in the United States with the reports being published during the study time period. Note, however, that if the case involves someone making the materials or if they're either obtaining them from or sending them to minors, obviously the individuals are also alleged to be committing a crime directly involving a child, so they would be included.

So here we are. The results from the first six months have right wingers tossing around all sorts of accusations about "agendas" and "cherry picking." But the facts are these: I don't leave any cases out or attempt to tweak the data. It is what it is - which is what has them upset because it runs completely counter to their narrative. But as Sgt. Joe Friday said, "Just the facts, Ma'am."

So, dig into the data as you will. Let me know if you find anything that surprises you. I'm greatly indebted to Caleb, another TikTok user, who created the backend/database that provides the graphical interface on the "Data" page.

Remember, though, that this is not my real job, nor is it Caleb's. We're trying to make this accessible to all - but we both work for a living, and this isn't what either of us do, nor are we charging any money or trying to in any way make a profit from this.

1. Web crawlers/spiders don't get to every website every day. Thus
reports can get delayed, throwing off the week in which an incident
appears in the TikTok reports - some incidents may get added into
totals for the week or so following occurrence.

1. We can't be certain that the spiders get everywhere, so some
reports could, theoretically, be missed.

1. Not all arrests or convictions result in news coverage, and there
is no central registry of cases. Those non-media reported cases will,
therefore, not be reflected in the data.

1. It's assumed that many crimes go unreported. There's no data on
whether those unreported crimes skew in any particular direction. Some
commenters have assumed that churches cover up many incidents, but
these reports make no assumptions about that.

1. One thing that in the past has thrown gross numbers off slightly is that a single
incident will result in stories at various milestones. For example, a
Catholic Cardinal generated stories when arrested, when going to trial
and when found to be not competent to stand trial. I try to filter for that
by either deleting duplicates (if you download data, look for the "deleted" column!) or updating existing entries, which creates a problem with the date on which an entry shows up, but otherwise should keep the gross numbers of offenders correct.

1. Why are the numbers in the data set different from the TikTok you
just watched? Simple: I update as I receive and categorize
reports. The TikToks are one week slices, and the data has likely been
updated since the one you saw was recorded. And there's the effect
from the corrections above - if there are multiple stories about a
single perpetrator I'll filter that out of the TikTok totals, but the
website may count the individual twice (we're working on a fix for
that - a duplicate name filter - but it's not implemented yet.)

I avoid cases that are just "materials" cases, involving
pictures/videos (those would, unfortunately, be far too many to
catalogue) - and I also do not include "sting" cases, where the perpetrator is connecting with an undercover law enforcement officer and never actually is in contact with a child. The cases included here relate solely to actual assaults on
children in the United States with the reports being published during the study time period. Note, however, that if the case involves someone making the materials or if they're either obtaining them from or sending them to minors,
obviously the individuals are also alleged to be committing a crime
directly involving a child, so they would be included.

So here we are. The results from the first six months have right
wingers tossing around all sorts of accusations about "agendas" and
"cherry picking." But the facts are these: I don't leave any cases
out or attempt to tweak the data. It is what it is - which is what has
them upset because it runs completely counter to their narrative. But
as Sgt. Joe Friday said, "Just the facts, Ma'am."

So, dig into the data as you will. Let me know if you find anything
that surprises you. I'm greatly indebted to Caleb, another TikTok
user, who created the backend/database that provides the graphical
interface on the "Data" page.

Remember, though, that this is not my real job, nor is it
Caleb's. We're trying to make this accessible to all - but we both
work for a living, and this isn't what either of us do, nor are we charging any money or trying to in any way make a profit from this.

##### THIS DATA IS UNITED STATES ONLY
And is valid for the period we're surveying
only. If it's an incident outside the USA or prior to mid February
2023, it's outside of our survey period and not part of our data.
THIS DATA IS UNITED STATES ONLY
And is valid for the period we're surveying only. If it's an incident outside the USA or prior to mid February 2023, it's outside of our survey period and not part of our data.
