Handling data

krajcsi edited this page Nov 4, 2018 · 13 revisions

CogStat is designed only to run statistical analyses, not to edit data.

  • We suggest using a spreadsheet program (e.g., LibreOffice Calc or Microsoft Excel) to store and edit data, and also to run basic analyses. Spreadsheet packages are very powerful for data manipulation, and they are also surprisingly well suited to simple (and sometimes more complex) statistical analysis.
  • You can also use SPSS to store and manipulate your data.

Only ASCII characters (i.e., approximately, English letters) are supported at the moment in variable names and values (issue #10). Other characters may work in some cases, but it is better to avoid non-ASCII characters in your data.
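To check a data set before importing it, a short script can list the cells that contain non-ASCII characters. The following is only a minimal sketch in Python and not part of CogStat: the function name `find_non_ascii` and the sample rows are hypothetical, and the data are assumed to be already available as a list of rows.

```python
def find_non_ascii(rows):
    """Return (row_index, column_index, text) for every cell with a non-ASCII character."""
    problems = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            text = str(cell)
            if not text.isascii():  # str.isascii() requires Python 3.7 or later
                problems.append((r, c, text))
    return problems


# Hypothetical sample data: the third row contains a non-ASCII letter
rows = [
    ["id", "Gender", "IQ"],
    ["lcf", "1", "96"],
    ["gök", "1", "121"],
]
print(find_non_ascii(rows))  # prints [(2, 0, 'gök')]
```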

CogStat handles string variables as nominal, even if another measurement level was set in your data source.
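The rule can be illustrated with a small Python sketch. This is not CogStat's own code: the helper names are hypothetical, and the sketch assumes that a variable counts as a string variable when at least one of its non-missing values cannot be read as a number.

```python
def is_number(text):
    """Return True if the text can be parsed as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def effective_level(declared, values):
    """Return the measurement level after applying the string-variable rule."""
    # Empty cells and "nan" are missing values and do not affect the decision
    present = [v for v in values if v not in ("", "nan")]
    if any(not is_number(v) for v in present):
        return "nom"  # string variables are always nominal
    return declared


print(effective_level("int", ["lcf", "gok", "tf"]))  # prints nom
print(effective_level("ord", ["1", "2", ""]))        # prints ord
```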

Store your data in SPSS

(New in v1.7) Simply open your SPSS .sav file in CogStat (e.g., drag and drop the file onto the CogStat window) and start analyzing it.

Because CogStat relies on the measurement levels of the variables to decide which analyses to run, it is essential to set the measurement levels in SPSS and save the file before loading the data into CogStat.

This data format could be ideal if you mainly use SPSS, but want to use the analyses available in CogStat to make your work more efficient and precise.

Store your data in spreadsheet software

There are two ways to import your data into CogStat:

  • Data can be imported through the clipboard: copy the data in the spreadsheet program and paste it into CogStat.
    • The precision of the copied data depends on how the numbers were displayed in the spreadsheet, so adjust the display according to your needs. CogStat handles precision automatically after importing the data, e.g., means are displayed with the precision of the raw data.
  • Or you can save your spreadsheet data as a .csv file and open it in CogStat. (See below how to save your data as a csv file.)
    • CogStat can open files with .txt, .log, .csv or .tsv extensions.
    • Any software that can export data in those formats can also be used, including statistical software packages. However, the measurement levels cannot be set with those packages (see more details below), so using statistical software for this purpose is not recommended.

How should the data look in your spreadsheet software?

For example,

id   Gender  IQ
nom  nom     int
lcf  1       96
gok  1       121
tf   2       118
trs  1       128
rs   2       99
  • As in any statistical software, rows should be the cases, and columns should be the variables.
  • The first row should contain the variable names.
    • Variables with missing names will be named by CogStat as Unnamed: 0, Unnamed: 1, etc.
    • If you omit the variable names row, CogStat will (erroneously) treat the entire first row (the first case in your data) as names.
    • Different variables cannot have the same name. If duplicate names are found, CogStat renames the variables whose name is already in use.
  • The second row contains the measurement levels of the variables.
    • Use nom, ord and int for nominal, ordinal and interval variables. If the row contains any other word, CogStat treats the row as values, and the measurement levels will not be recognized.
    • Although this row is technically optional, the information is essential, as CogStat uses it to select the calculations automatically.
      • If the measurement level row is not given, CogStat sets the variables to unk (unknown). In most cases these variables are handled as interval variables. Thus, if you have ordinal or nominal numerical variables, set the measurement levels; otherwise, incorrect results will be calculated.
  • All other rows contain the data values.
    • For a missing value, leave the cell empty or write nan.
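The layout rules above can be summarized in a short Python sketch. It is an illustration under stated assumptions rather than CogStat's actual import code: the function name `read_layout` is hypothetical, the input is assumed to be tab-separated text, and renaming of missing or duplicate names is left out.

```python
import csv
import io

LEVELS = {"nom", "ord", "int"}


def read_layout(text):
    """Split tab-separated text into names, measurement levels and cases."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    names = rows[0]
    # The second row counts as measurement levels only if every cell is nom, ord or int
    if len(rows) > 1 and all(cell in LEVELS for cell in rows[1]):
        levels, cases = rows[1], rows[2:]
    else:
        # Without a valid measurement level row, every variable is unknown
        levels, cases = ["unk"] * len(names), rows[1:]
    # Empty cells and "nan" are treated as missing values
    cases = [[None if cell in ("", "nan") else cell for cell in row] for row in cases]
    return names, levels, cases


sample = "id\tGender\tIQ\nnom\tnom\tint\nlcf\t1\t96\ngok\t\t121\n"
names, levels, cases = read_layout(sample)
print(names)   # prints ['id', 'Gender', 'IQ']
print(levels)  # prints ['nom', 'nom', 'int']
print(cases)   # prints [['lcf', '1', '96'], ['gok', None, '121']]
```

If the second row of the sample contained any other word, the same call would return unk for every variable and keep that row as the first case.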

How to save spreadsheets as csv files?

In the csv file, the cells should be separated by tabs, and the decimal sign should be a dot.

  • In LibreOffice Calc
    • File > Save as...
    • Format: Text CSV (.csv), Edit filter settings on
    • Field delimiter: {Tab}, Text delimiter: "
  • In Microsoft Excel
    • File > Save as > Text (Tab delimited)
  • In Google Sheets
    • File > Download as > Plain text
  • In Gnumeric
    • File > Save as
    • File Type: Text (configurable)
    • Separator: Tab, Quote character: "
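If the data are produced by a script rather than a spreadsheet, the same file format can be written with Python's standard csv module. This is a minimal sketch, not an official CogStat tool: the function name `to_tab_separated` and the sample values are hypothetical.

```python
import csv
import io


def to_tab_separated(names, levels, cases):
    """Return the data as tab-separated text with a names row and a measurement level row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", quotechar='"', lineterminator="\n")
    writer.writerow(names)
    writer.writerow(levels)
    for case in cases:
        # None marks a missing value and is written as an empty cell
        writer.writerow(["" if value is None else value for value in case])
    return buffer.getvalue()


text = to_tab_separated(
    ["id", "IQ"],
    ["nom", "int"],
    [["lcf", 96.5], ["gok", None]],
)
print(repr(text))  # prints 'id\tIQ\nnom\tint\nlcf\t96.5\ngok\t\n'
```

Python writes floats with a dot as the decimal sign regardless of the system locale. To create a file, write the returned text to a path opened with `open(path, "w", newline="")`.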