deonpollard/ScanningData

Welcome!

Welcome to the ScanningData GitHub repository! This repository contains sets of scripts for data scanning in support of the seeWaybeyond Platform, although they can also be used entirely stand-alone. The repository is still at an early stage.

Scanning the following Data

Structured Data

Structured data usually resides in relational databases (RDBMS). Fields store length-delineated data such as phone numbers, Social Security numbers, or ZIP codes. Even variable-length text strings such as names are contained in records, which makes the data simple to search.
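
Because structured fields hold predictable formats, a scan can test each column value against known patterns. The following is a minimal Python sketch of that idea, not code from this repository: the `PATTERNS` table, the `classify_columns` helper, and the `customers` table are all hypothetical, and the regular expressions assume simple US-style formats.

```python
import re
import sqlite3

# Hypothetical patterns for US-style values; real data needs stricter validation.
PATTERNS = {
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "zip": re.compile(r"\d{5}(?:-\d{4})?"),
    "phone": re.compile(r"\(?\d{3}\)?[ -]?\d{3}-\d{4}"),
}


def classify_columns(conn, table):
    """Map each column to the pattern names that fully match at least one value."""
    # The table name is interpolated directly, so pass only trusted names.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [desc[0] for desc in cur.description]
    hits = {col: set() for col in columns}
    for row in cur:
        for col, value in zip(columns, row):
            if not isinstance(value, str):
                continue
            for name, pattern in PATTERNS.items():
                if pattern.fullmatch(value):
                    hits[col].add(name)
    # Report only the columns where something matched.
    return {col: sorted(names) for col, names in hits.items() if names}


def demo():
    """Build an in-memory table (illustrative data) and classify its columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, ssn TEXT, zip TEXT, phone TEXT)")
    conn.execute(
        "INSERT INTO customers VALUES (?, ?, ?, ?)",
        ("Ada Example", "123-45-6789", "90210", "(555) 123-4567"),
    )
    return classify_columns(conn, "customers")
```

Calling `demo()` reports which columns look like they hold sensitive values; the `name` column is omitted because none of the assumed patterns match it.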

Unstructured Data

Unstructured data has internal structure but is not organized by a pre-defined data model or schema. It may be textual or non-textual, and human- or machine-generated. It may also be stored in a non-relational (NoSQL) database. It typically includes:

  • Text files: Word processing, spreadsheets, presentations, email, logs.
  • Email: Email has some internal structure thanks to its metadata, and we sometimes refer to it as semi-structured. However, its message field is unstructured and traditional analytics tools cannot parse it.
  • Social Media: Data from Facebook, Twitter, LinkedIn.
  • Websites: YouTube, Instagram, photo-sharing sites.
  • Mobile data: Text messages, locations.
  • Communications: Chat, IM, phone recordings, collaboration software.
  • Media: MP3, digital photos, audio and video files.
  • Business applications: MS Office documents, productivity applications.

Typical machine-generated unstructured data includes:

  • Satellite imagery: Weather data, land forms, military movements.
  • Scientific data: Oil and gas exploration, space exploration, seismic imagery, atmospheric data.
  • Digital surveillance: Surveillance photos and video.
  • Sensor data: Traffic, weather, oceanographic sensors.
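
With no schema to rely on, a scan of unstructured text has to search the content itself. The sketch below illustrates one simple approach in Python, line-by-line pattern matching over free text. It is not taken from this repository; `scan_text` and the two patterns are hypothetical and deliberately loose.

```python
import re

# Hypothetical, simplified patterns; production scanning would need tighter rules.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def scan_text(text):
    """Return (line_number, label, matched_text) for every pattern hit in the text."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in (("ssn", SSN), ("email", EMAIL)):
            for match in pattern.finditer(line):
                findings.append((line_no, label, match.group()))
    return findings
```

The same function could be applied to the body of an email or the contents of a log file once the text has been extracted. Binary formats such as images or audio would need a separate extraction step that this sketch does not cover.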

Semi-Structured Data

Semi-structured data maintains internal tags and markings that identify separate data elements, which enables information grouping and hierarchies. Examples include:

  • XML (Extensible Markup Language): a semi-structured document markup language.
  • JSON (JavaScript Object Notation): an open-standard, semi-structured data interchange format.
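
The tags and keys in semi-structured data let a scan report where each value sits in the hierarchy. As a rough illustration (again in Python, and not code from this repository), the hypothetical helpers `json_paths` and `xml_paths` below walk a parsed document and yield a path for each leaf value, using only the standard `json` and `xml.etree.ElementTree` modules.

```python
import json
import xml.etree.ElementTree as ET


def json_paths(node, prefix=""):
    """Yield (path, value) for every leaf value in parsed JSON data."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from json_paths(child, f"{prefix}/{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from json_paths(child, f"{prefix}[{index}]")
    else:
        yield prefix, node


def xml_paths(elem, prefix=""):
    """Yield (path, text) for every XML element that carries non-blank text."""
    path = f"{prefix}/{elem.tag}"
    text = (elem.text or "").strip()
    if text:
        yield path, text
    for child in elem:
        yield from xml_paths(child, path)
```

For example, `list(json_paths(json.loads('{"person": {"name": "Ada"}}')))` yields a single entry with the path `/person/name`. The resulting values could then be checked against the same kind of patterns used for structured fields.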

Other Data

Loosely grouped for now.

  • Third-Party Data
  • Cloud Data
  • Big Data

About

A list of PowerShell scripts for scanning data.
