This repository contains our open source Curated News Analytical Dataset. The goal of this dataset is to give researchers access to our small data approach.

CuratedNews/analyticaldataset

Analytical Dataset

Our analytical dataset focuses on collating quality news-related data. It is open source and freely available, so users can inspect our data and use it as a reference.

Check out our interactive search page/demo and our codebook.

Usage

Our JSON dataset is available for testing purposes only: https://raw.githubusercontent.com/CuratedNews/analyticaldataset/main/CuratedNewsDataset.json
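
A minimal sketch of loading the dataset in Python with only the standard library. The helper names `parse_records` and `fetch_records` are hypothetical, and the layout of the file is an assumption: the sketch accepts either a top-level list of records or a single record object, since this README only shows one record.

```python
import json
from urllib.request import urlopen

DATASET_URL = (
    "https://raw.githubusercontent.com/CuratedNews/"
    "analyticaldataset/main/CuratedNewsDataset.json"
)


def parse_records(text):
    """Parse dataset JSON text into a list of record dicts.

    Assumption: the top level is a list of records or one record object.
    """
    data = json.loads(text)
    if isinstance(data, dict):
        return [data]
    return list(data)


def fetch_records(url=DATASET_URL, timeout=30):
    """Download the JSON file and return its records (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_records(response.read().decode("utf-8"))
```

Calling `fetch_records()` returns the records as dictionaries; `parse_records` can also be used on a local copy of the file.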

Data format example

title, link, & date of news article

{
 "title": "The three challenges keeping cars from being fully autonomous",
 "link": "https://mittr-frontend-prod.herokuapp.com/s/613399/the-three-challenges-keeping-cars-from-being-fully-autonomous/",
 "date": "2020-09-24 00:00:00 UTC",
 "titlewordcount": 9,
 "titlesentiment": "0.333333333333333",
 "titlesentimentoverall": "Positive",

source, topic, & leaning

 "Source": "MIT",
 "Topic": "Technology",
 "Leaning": "Academic",

other unique categorical variables (all unique variables must be operationalized/defined)

 "President": "Trump"
}

complete example (minified)

{"title": "The three challenges keeping cars from being fully autonomous","link": "https://mittr-frontend-prod.herokuapp.com/s/613399/the-three-challenges-keeping-cars-from-being-fully-autonomous/","date": "2020-09-24 00:00:00 UTC","titlewordcount": 9,"titlesentiment": "0.333333333333333","titlesentimentoverall": "Positive","Source": "MIT","Topic": "Technology","Leaning": "Academic","President": "Trump"}
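
Note that `date` and `titlesentiment` are stored as strings in the example record. The following sketch shows one way to convert them to typed values in Python. The helper name `parse_record` is hypothetical, and the date format string is inferred from the single example above, so other records may need different handling.

```python
import json
from datetime import datetime, timezone

# The complete example record from this README, verbatim.
RAW = (
    '{"title": "The three challenges keeping cars from being fully autonomous",'
    '"link": "https://mittr-frontend-prod.herokuapp.com/s/613399/'
    'the-three-challenges-keeping-cars-from-being-fully-autonomous/",'
    '"date": "2020-09-24 00:00:00 UTC","titlewordcount": 9,'
    '"titlesentiment": "0.333333333333333","titlesentimentoverall": "Positive",'
    '"Source": "MIT","Topic": "Technology","Leaning": "Academic","President": "Trump"}'
)


def parse_record(raw):
    """Decode one record and convert its string-typed fields."""
    record = json.loads(raw)
    # Assumed format, inferred from the example: "YYYY-MM-DD HH:MM:SS UTC".
    record["date"] = datetime.strptime(
        record["date"], "%Y-%m-%d %H:%M:%S UTC"
    ).replace(tzinfo=timezone.utc)
    # The sentiment score is a quoted number in the JSON.
    record["titlesentiment"] = float(record["titlesentiment"])
    return record


record = parse_record(RAW)
print(record["Source"], record["date"].date(), record["titlesentiment"])
```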

Calculation of overall sentiment score

  • Overall positive = sentimentr score > 0
  • Overall negative = sentimentr score < 0
  • Overall neutral = sentimentr score = 0
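
The thresholds above can be sketched as a small Python function. This only applies the sign rule to a score that has already been computed; it does not call sentimentr, which is an R package. The function name `overall_sentiment` is hypothetical, and the labels "Negative" and "Neutral" are assumed by analogy with the "Positive" value in the example record.

```python
def overall_sentiment(score):
    """Map a sentiment score to an overall label using the sign rule."""
    value = float(score)  # the dataset stores the score as a string
    if value > 0:
        return "Positive"
    if value < 0:
        return "Negative"
    return "Neutral"


print(overall_sentiment("0.333333333333333"))
```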

Other documentation/references

How does it work?

  • Check our interactive search page/demo for a hands-on walkthrough with explanations of our analytical dataset.
  • We've built a headline text classifier with this dataset. Check out the demo.

Contribute

Want to contribute? You can add unique categorical variables to the current data. Send us a pull request.

See our terms & conditions

Our terms & conditions

Our codebook

Want to know more?

Visit https://curatednews.xyz
