Grammar Fields

Grammar Fields are a psychosecurity construct for analyzing grammar behavior and language usage over time, with a focus on deviations from the norm, the acquisition of new words, the evolution of grammar, and the pace of linguistic development.

To analyze language, we collect and organize all the text that a person has written, arranging it chronologically. We then determine the frequency of singularized "go words" (words that are actively used) and rank them in ascending order of frequency. This compilation forms the basis of the "grammar field," which is then examined for changes in patterns and the speed at which those changes occur. The goal is to make the field compatible with neural networks so they can detect how factors such as influential individuals, trends, cults, artificial intelligence, and advertising campaigns impact language usage.

This presentation about Grammar Fields will help you understand their underlying principles.

Grammar Fields

You can review the slides in this presentation here

Technical Definition

The grammar field comprises two axes: Time Grouping (x) and Brevity Rank (y).

Time Grouping is hourly, daily, weekly, or monthly. (Custom grouping coming soon)

Brevity Rank is the number of times a unique word is used.

Where x and y intersect is the Intensity. For example, if the Brevity Rank is 14, the Time Grouping is monthly at Month 1 (January), and the value where these two intersect is 163, then Brevity Rank 14 for January has 163 occurrences of words being used 14 times; in other words, 163 unique words were each used exactly 14 times in January.

This is NOT a Natural Language Processing approach. We do not analyze the words or the grammar connections themselves. (Detailed reasoning for this will come later.) We are NOT doing sentiment analysis. At a high level, we assume that a lower brevity means an increased likelihood of relying on more intricate grammar, while a higher brevity means repetition of preferred grammar.

For example: if a person uses the words "cat", "jail", and "linger" each 5 times in January, then their Brevity Rank 5 for that Time Grouping will have an Intensity of 3. (They used three unique words five times in January.)
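
To make the arithmetic concrete, here is a minimal JavaScript sketch (not part of the library, and assuming "go words" are simply whitespace-separated, lower-cased tokens) that reproduces the example above for a single Time Grouping:

const text = 'cat jail linger cat jail linger cat jail linger cat jail linger cat jail linger'

// Count how many times each unique word appears in this Time Grouping
const counts = {}
for (const word of text.toLowerCase().split(/\s+/)) {
  counts[word] = (counts[word] || 0) + 1
}

// Brevity Rank -> Intensity: how many unique words share each usage count
const intensities = {}
for (const count of Object.values(counts)) {
  intensities[count] = (intensities[count] || 0) + 1
}

console.log(intensities) // { '5': 3 } -> Brevity Rank 5 has an Intensity of 3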

Example

The following image highlights these points:

images/example.png

These are the annual grammar fields of Elon Musk between 2015 and 2017

We see a slurring effect happening to his grammar fields after Trump was elected.

We also see that by 2017, his usage of rare words has increased significantly compared to 2015.

Additional forensics on these clusters can help us identify which egregores Elon was influenced by.

Installation

npm install github:HyperCrowd/word-cluster-matrix

Usage

Google Colab

You can test Grammar Fields and upload custom CSVs right now by clicking on the button below:

Open In Colab

Google Cloud Shell

Start tinkering with Grammar Fields at the CLI level with Google Cloud Shell:

Open in Cloud Shell

Docker

The Docker implementation creates a data directory with csvs and fields folders in it. Each of these folders has a folder for each field mode (absolute, differential, and normalized). In each mode folder is a folder for each time grouping (hourly, daily, weekly, and monthly).
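
For reference, the resulting layout looks roughly like this (only the daily grouping is shown; the other groupings follow the same pattern):

data/
  csvs/
    absolute/daily/
    differential/daily/
    normalized/daily/
  fields/
    absolute/daily/
    differential/daily/
    normalized/daily/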

When you add a CSV to a mode and time grouping folder in the csvs directory, the Docker instance will create a grammar field for that CSV file in the corresponding mode and time grouping folder in the fields directory. For example, if you create a CSV at /data/csvs/absolute/daily/test.csv, the Docker instance will create a field output at /data/fields/absolute/daily/test.csv

The Docker instance will poll for new files and changes in the csvs folders every ten seconds.

Please make sure the CSVs you add to the appropriate csvs folder have a header row in which the time column is named time and the text column you wish to analyze is named text.
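
For example, a minimal input CSV could look like the following (the rows are illustrative, and epoch-millisecond timestamps are assumed here to match the JavaScript example below; adjust the time format to whatever your data uses):

time,text
1322751311000,what is the deal?
1322753344000,gotta pay rent or else i will be broke AND without protection from the elements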

To run Grammar Fields in Docker mode, simply run:

./docker.sh

You can test this Docker implementation by clicking here and typing ./docker.sh in the console:

Open in Cloud Shell

Local CLI

For command line processing of CSVs, this example will cover most cases. Replace source.csv with the CSV you want to analyze, and field.csv with the file name of the grammar field output:

cat source.csv | src/cli.js field.csv

In this example, we utilize every flag:

cat source.csv | src/cli.js -t "timestamp" -x "messages" -b "daily" -m "normalized" field.csv
  • --time, -t: The name of the time column in the CSV.
  • --text, -x: The name of the text column in the CSV.
  • --mode, -m: The type of grammar field you want. (absolute, differential, normalized)
  • --breakdown, -b: What time groupings the grammar field is structured as. (hourly, daily, weekly, monthly)

JavaScript

const { GrammarField } = require('grammar-field');

// Possible tweets
const tweets = [
  'what is the deal?',
  'wow, this is not the cringe I was looking for',
  'gotta pay rent or else i will be broke AND without protection from the elements',
  'AI is going to replace my job, but it will never replace my heart uwu'  
]

// JavaScript timestamps (one per tweet)
const times = [
  1322751311000,
  1322753344000,
  1322918428000,
  1322918527000
]

// Generates a grammar field where the times, the brevity ranks, and values are actual values 
const a = new GrammarField(tweets, times) 

// Generates a grammar field where the times, the brevity ranks, and values are actual values and the x axis label (time) is divided by 1000 
const b = new GrammarField(tweets, times, 'ABSOLUTE', x => x / 1000) 

// Generates a grammar field where the times, the brevity ranks, and values are differentials based on their respective prior values
const c = new GrammarField(tweets, times, 'RELATIVE')

// Generates a grammar field where the times, the brevity ranks, and values are differentials based on their respective prior values, and the x axis label (time) has an 's' suffix appended
const d = new GrammarField(tweets, times, 'RELATIVE', x => x + 's')

// Generates a grammar field where there are 100 times, 100 ranks, and 100x100 values, and all values have been normalized to this scale of 100
const e = new GrammarField(tweets, times, 'NORMALIZED')
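
The comments above describe the field modes at a high level. As a rough intuition only (this is a simplification, not the library's exact algorithm), a differential field records changes between consecutive values, while a normalized field rescales everything onto a fixed 100-point scale:

// Illustrative only: simplified intuition for the field modes
const absolute = [163, 170, 150] // raw Intensity values for one Brevity Rank across three time buckets

// Differential: change relative to the previous value (assumed behaviour)
const differential = absolute.map((v, i) => (i === 0 ? v : v - absolute[i - 1]))
// -> [163, 7, -20]

// Normalized: rescale values onto a 0-100 range (assumed behaviour)
const max = Math.max(...absolute)
const normalized = absolute.map(v => (v / max) * 100)
// -> [~95.9, 100, ~88.2]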

Examples

Please see the following tests for examples on how to programmatically use Grammar Fields:

  • Grammar fields: How to use the basic features of a Grammar Field
  • Data loaders: How to load data into a Grammar Field
  • Features: How to perform mathematical analysis on both the Brevity Ranks and the Time Grouping of a Grammar Field.

Contribution

You can edit and toy around with the code very quickly in StackBlitz by clicking on this button:

Open in StackBlitz

