1. What is Data?
Note: Content in this section is adapted from: Atenas, J., Bonina, C., Pane, J., & Belbis, J. (2021). What is open data? In Understanding data: Praxis and politics. HDI - Data, Praxis and Politics. https://doi.org/10.5281/zenodo.4783601
Data are characteristics or information, usually numerical, collected through observation. More technically, they are a set of values of qualitative or quantitative variables about one or more persons or objects, while a datum is a single value of a single variable. Data are raw, unprocessed facts, figures, or observations. They can take many forms (numbers, text, images, or measurements), but on their own they lack meaning or context. Typically collected through observation, measurement, or recording, data are the building blocks of knowledge.
Data are transformed into information when they are created, extracted, elaborated, and used with pre-established objectives. An information system, often made up of data of the same or different types (a collection of related data is called a dataset), is transformed into knowledge when it is interpreted through tools, applications, methods, indicators, and analytical frameworks.
Data can be small or big, private, personal, governmental, military, scientific, public, confidential, commercial, financial, or open. They are normally delivered in machine-readable file formats, in what is known as raw data. The most common data types include integer, floating-point number, character, string, and Boolean.
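As a small illustration, the basic value types listed above can be shown in Python. This is only a sketch: the record and its field names are invented for the example, and Python has no separate character type, so a single character is represented as a string of length one.

```python
# Hypothetical record, invented for illustration only.
record = {
    "household_size": 4,        # integer
    "monthly_income": 1520.75,  # floating-point number
    "region_code": "N",         # character (a length-1 string in Python)
    "city": "Montevideo",       # string
    "has_internet": True,       # Boolean
}

# Map each field to the name of its Python type.
type_names = {field: type(value).__name__ for field, value in record.items()}

for field, name in type_names.items():
    print(f"{field}: {name}")
```

Knowing the type of each value matters because it determines which operations make sense: incomes can be averaged, while city names can only be counted or compared.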
With the constant evolution of technology, the informational content and the data held by public administrations represent valuable opportunities to promote transparency in governmental action. Moreover, they can offer more efficient services and, since they facilitate reuse by other public and private actors, they can also be used in areas other than those for which they were originally produced or collected.
Knowledge, in practice, acquires the value of awareness when it is used to change and improve reality (facts). This is particularly true of open data, which can be described as “collective”, that is, held for the common good (see the Open Data Handbook).
flowchart TD
A[Data]
A --> B[Qualitative Data]
A --> C[Quantitative Data]
B --> B1["Nominal<br/>Categories without order"]
B --> B2["Ordinal<br/>Ordered categories"]
C --> C1["Discrete<br/>Countable values"]
C --> C2["Continuous<br/>Measurable values"]
%% Styling (pastel colors + black text)
style A fill:#FFD1DC,stroke:#333,stroke-width:2px,color:#000
style B fill:#CDE7FF,stroke:#333,color:#000
style C fill:#CDE7FF,stroke:#333,color:#000
style B1 fill:#E2F0CB,stroke:#333,color:#000
style B2 fill:#E2F0CB,stroke:#333,color:#000
style C1 fill:#FFF1B6,stroke:#333,color:#000
style C2 fill:#FFF1B6,stroke:#333,color:#000
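The four categories in the diagram differ in which summaries are meaningful. The sketch below assumes a small invented survey (all variable names and values are hypothetical) and applies one defensible summary to each category: the most frequent category for nominal data, the middle rank for ordinal data, and the arithmetic mean for discrete and continuous data.

```python
from statistics import mean, mode

# Hypothetical survey values, invented for illustration only.
blood_type = ["A", "O", "B", "O", "AB"]                    # nominal
satisfaction = ["low", "high", "medium", "high", "low"]    # ordinal
children = [0, 2, 1, 3, 2]                                 # discrete
height_cm = [162.5, 171.0, 158.2, 180.4, 175.9]            # continuous

# Nominal: categories have no order, so only frequencies are meaningful.
most_common_blood_type = mode(blood_type)

# Ordinal: categories are ordered, so the middle rank (median) is meaningful.
order = ["low", "medium", "high"]
ranks = sorted(order.index(level) for level in satisfaction)
median_satisfaction = order[ranks[len(ranks) // 2]]

# Discrete and continuous: values are numeric, so arithmetic is meaningful.
mean_children = mean(children)
mean_height_cm = mean(height_cm)

print(most_common_blood_type, median_satisfaction, mean_children, round(mean_height_cm, 1))
```

Averaging the blood types, by contrast, would not be meaningful, which is why the classification is worth settling before any analysis.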
Whilst data are features of information collected through observation, information is understood as a symbolic representation that describes facts, conditions, values, or situations, collected and arranged appropriately to fulfil the objective of the institution that manages it.
On their own, data lack semantic meaning: they mean nothing to the recipient of the message and therefore add no value. For data to make sense, they must be processed, associated, or grouped within the same context to form information.
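A minimal sketch of this point, using invented values: the bare numbers below say nothing by themselves, but once each value is associated with what was measured, in which unit, for whom, and when, it can be read as a statement. The field names and the `describe` helper are hypothetical and chosen only for this example.

```python
# Hypothetical readings, invented for illustration only.
raw_values = [38.5, 39.1, 37.2]

# The same values grouped with their context.
readings = [
    {"patient": "P-01", "measure": "body temperature", "unit": "°C",
     "value": 38.5, "date": "2024-03-01"},
    {"patient": "P-01", "measure": "body temperature", "unit": "°C",
     "value": 39.1, "date": "2024-03-02"},
    {"patient": "P-01", "measure": "body temperature", "unit": "°C",
     "value": 37.2, "date": "2024-03-03"},
]

def describe(reading):
    """Turn one contextualised reading into a readable statement."""
    return (f"{reading['date']}: {reading['patient']} had a "
            f"{reading['measure']} of {reading['value']} {reading['unit']}")

for reading in readings:
    print(describe(reading))
```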
In discussions about data literacy, it is essential to distinguish between data and information, as the two terms are often used interchangeably despite representing different stages in the process of knowledge creation. Understanding this distinction helps academics and students better interpret, analyse, and communicate insights derived from datasets.
Information emerges when data are processed, organised, and interpreted in a meaningful way. It provides context, relevance, and understanding, enabling individuals to draw conclusions, make decisions, and generate knowledge. In academic and research contexts, the transition from data to information is a critical step. It involves applying analytical methods, frameworks, and domain knowledge to transform raw inputs into meaningful outputs. This transformation is not neutral: it is shaped by interpretation, context, and purpose.
flowchart TD
A["Raw Data<br/>Numbers, Text, Observations"] --> B[Data Processing]
B --> C1[Organisation]
B --> C2[Filtering]
B --> C3[Structuring]
C1 --> D[Analysis]
C2 --> D
C3 --> D
D --> E[Interpretation]
E --> F["Information<br/>Meaningful Insights"]
F --> G1[Understanding]
F --> G2[Decision-Making]
F --> G3[Knowledge Creation]
%% Styling (pastel + black text)
classDef data fill:#e6f7ff,stroke:#444,color:#000;
classDef process fill:#fff5e6,stroke:#444,color:#000;
classDef analysis fill:#e6ffe6,stroke:#444,color:#000;
classDef output fill:#f3e6ff,stroke:#444,color:#000;
class A data;
class B,C1,C2,C3 process;
class D,E analysis;
class F,G1,G2,G3 output;
The transformation from data to information typically involves:
- Organisation: Structuring data into categories or formats
- Contextualisation: Adding meaning based on time, place, or purpose
- Analysis: Identifying patterns, relationships, or trends
- Interpretation: Drawing conclusions based on evidence
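The four steps above can be sketched on a tiny invented dataset. Everything here is an assumption made for illustration: the rainfall figures, the station names, and in particular the `DRY_THRESHOLD_MM` cut-off, which stands in for the kind of interpretive choice that makes the transformation non-neutral.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw observations: (station, month, rainfall), invented for illustration.
raw = [
    ("north", "2023-01", 80.0),
    ("north", "2023-02", 60.0),
    ("south", "2023-01", 20.0),
    ("south", "2023-02", 10.0),
]

# Organisation: structure the observations into categories (one list per station).
by_station = defaultdict(list)
for station, month, rainfall in raw:
    by_station[station].append(rainfall)

# Contextualisation: record what the numbers refer to (unit and period).
context = {"unit": "mm", "period": "January to February 2023"}

# Analysis: identify a pattern, here the mean rainfall per station.
averages = {station: mean(values) for station, values in by_station.items()}

# Interpretation: draw a conclusion against an assumed threshold.
DRY_THRESHOLD_MM = 30.0  # hypothetical cut-off chosen for this sketch
findings = {
    station: "dry" if average < DRY_THRESHOLD_MM else "not dry"
    for station, average in averages.items()
}

for station, label in findings.items():
    print(f"{station}: {averages[station]} {context['unit']} on average "
          f"({context['period']}), classified as {label}")
```

Changing the threshold changes the conclusion while the data stay the same, which is one concrete sense in which interpretation depends on context and purpose.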