### Numeric datatypes

In [1]:
0x42 / byte,  1 byte
42h  / short, 2 bytes
42i  / int,   4 bytes
42j  / long,  8 bytes
42e  / real,  4 bytes
42.0 / float, 8 bytes


0x42


42h


42i


42


42e


42f


**Equality**

In [2]:
42j=42i

1b


**Identity**

In [3]:
42j~42i

0b
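
As a small sketch of the difference (the literals are chosen only for illustration): `=` compares values after promoting numeric types, while `~` (match) also requires the types to be identical.

```q
42=42.0      / 1b: equal values, types are promoted before comparing
42~42.0      / 0b: long versus float, so no match
42~42j       / 1b: same type and same value
42i=42h      / 1b: int and short hold equal values
42i~42h      / 0b: but they are different types
```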


#### Float

The **float** type represents an IEEE standard eight-byte floating-point number, often called "double" in traditional languages. A float can hold (at least) 15 decimal digits of precision.

In [4]:
3.14~3.14f

1b


#### Real

The **real** type represents a single-precision, four-byte floating-point number and is denoted by numeric digits, optionally containing a decimal point, with a trailing type indicator e. Be mindful that this type is called 'float' in some languages. A real can hold at least 6 decimal digits of precision.

In [5]:
3.14e
3.14e=3.14f
3.14e~3.14f

3.14e


0b


0b


#### Accuracy

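Change the display precision using the \P command (note upper case), which sets the number of digits shown for floats, up to 16. This affects only how values are displayed, not how they are stored.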

In [6]:
\P 8
1%3

0.33333333


In [7]:
\P 2
1%3

0.33
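
A minimal sketch suggesting that \P changes only the display; the variable name `x` is an assumption for illustration. The stored value keeps its full precision, so it does not compare equal to the two-digit number that is shown.

```q
x:1%3
\P 2
x              / displays 0.33 under \P 2
x=0.33         / 0b: the stored value is not 0.33
x=1%3          / 1b: the stored value is unchanged
```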


#### 2.3.1 Boolean

The boolean type uses one byte to store a bit and is denoted by the bit value with the trailing type indicator b. There are no keywords for 'true' or 'false', nor are there separate logical operators for booleans.

In [8]:
1b
0b

1b


0b


The ability of booleans to participate in arithmetic can be useful in eliminating conditionals.

In [9]:
flag:1b
base:100
base+flag*42

142
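
For comparison, here is a sketch of the same calculation written with the conditional `$[...]`, followed by the arithmetic form applied to a boolean list. The list `flags` is hypothetical data, not part of the original example.

```q
flag:1b
base:100
$[flag;base+42;base]     / conditional form: 142
base+flag*42             / arithmetic form: 142
flags:101b               / a boolean list (hypothetical data)
base+flags*42            / 142 100 142, with no loop or branch
```

The arithmetic form extends to lists unchanged, which the conditional does not.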


#### 2.3.3 GUID

A GUID (globally unique identifier) is a 16-byte binary value that is unique across time and space (well, nearly so). It is ideally suited for locally generating a globally unique identifier without resorting to a central control mechanism – e.g., transaction IDs. It can be used as a table key or in joins and is preferred to strings or symbols in such situations.

The guid type does not have a literal form since it is generated for you by a process that guarantees uniqueness. Applying ? to the null guid value 0Ng generates a list of guids.

In [10]:
5?0Ng
-2?0Ng

8c6b8b64-6815-6084-0a3e-178401251b68 5ae7962d-49f2-404d-5aec-f7c8abbae288 5a5..


ef8d8cd4-59eb-7d68-cb08-2ba314829457 aa20bf87-dad4-be4e-1829-8178f6a15429


The difference between using a positive integer vs. a negative integer to generate a list of GUIDs is that the positive case uses the same initial seed in each new q session whereas the negative case uses a random seed. The former is useful for reproducible results during testing but only the latter should be used in production; otherwise, your "GUIDs" will not be unique across q sessions.

### 2.4 Text Data

#### 2.4.1 Char

A char holds an individual ASCII or 8-bit Unicode character that is stored in one byte. It corresponds to a SQL CHAR. It is denoted by a single character enclosed in double quotes.

In [11]:
"a"

"a"


In [12]:
"\n"

"\n"


#### 2.4.2 Symbol

A symbol is akin to a SQL VARCHAR, in that it can hold an arbitrary number of characters, but is different in that it is atomic. The char "q" and the symbol `kdb are both atomic entities. A symbol is irreducible, meaning that the individual characters that comprise it are not directly accessible.

A symbol is not a string. We shall see in Chapter 3 that there is an analogue of strings in q, namely a list of char. While a list of char is a kissing cousin to a symbol, we emphasize that a symbol is not a collection of char. The symbol `a and the char "a" are not the same, as we can see by asking q if they are identical.
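
The comparison can be sketched as follows; the conversions in the last two lines are included only to show how to move between the two representations.

```q
`a~"a"          / 0b: a symbol is not a char
`kdb~"kdb"      / 0b: nor is it a list of char
count `kdb      / 1: a symbol is an atom
count "kdb"     / 3: a string is a list of three chars
string `kdb     / "kdb": convert a symbol to a string
`$"kdb"         / `kdb: convert a string to a symbol
```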

In [13]:
`symbol

`symbol


Symbols are used for names in q. All names are symbols but not all symbols are names.

### 2.5 Temporal Data

#### 2.5.1 date

A date is stored as a four-byte signed integer and is denoted by yyyy.mm.dd, where yyyy represents the year, mm the month and dd the day. The underlying value is the count of days from Jan 1, 2000 – positive for post-millennium and negative for pre.

In [14]:
2000.01.01

2000.01.01


In [15]:
2000.01.01=0
2000.01.01~0

1b


0b


In [16]:
1999.12.31=-1
2000.01.02=1

1b


1b


Since real-world months and days begin at 1 (not zero), January is 01. Leading zeroes in months and days are required; their omission causes an error.

The underlying day count can be obtained by casting.

In [24]:
`int$2000.02.01

31i
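
Because a date is a day count underneath, ordinary integer arithmetic applies to it. A brief sketch, using the dates from the examples above:

```q
2000.02.01-2000.01.01   / 31i: days between two dates
2000.01.01+31           / 2000.02.01: shift a date by 31 days
`date$31                / 2000.02.01: cast a day count back to a date
```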


### 2.5.2 Time Types

#### Time

There are two versions of time, depending on the resolution required. If **milliseconds** are sufficient, use the **time** type, which stores the count of milliseconds from midnight in a 32-bit signed integer. It is denoted by **hh:mm:ss.uuu** where hh represents hours on the 24-hour clock, mm represents minutes, ss represents seconds, and uuu represents milliseconds.

In [25]:
12:34:56.789
12:00:00.000=12*60*60*1000

12:34:56.789


1b


Leading zeroes are required in all constituents of a time value. The underlying millisecond count can be obtained by casting to an int.

In [26]:
`int$12:00:00.000

43200000i


#### Timespan

If milliseconds are not sufficient, use the **timespan** type, which stores the count of **nanoseconds** from midnight as a long integer.

It is denoted by **0Dhh:mm:ss.nnnnnnnnn** where hh represents hours on the 24-hour clock, mm represents minutes, ss represents seconds, and nnnnnnnnn represents nanoseconds. Observe that the leading 0D is optional.

In [27]:
12:34:56.123456789
12:34:56.123456 / microseconds become nanos

0D12:34:56.123456789


0D12:34:56.123456000


Leading zeroes in constituents are again required.

The underlying nanosecond count can be obtained by casting to a long.

In [28]:
`long$12:34:56.123456789

45296123456789
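
The casts also work in the other direction. As a sketch, the counts obtained above can be turned back into time and timespan values:

```q
`time$43200000               / 12:00:00.000
`timespan$45296123456789     / 0D12:34:56.123456789
```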


### 2.5.3 Date-Time Types

#### Datetime ( deprecated )

There are two date-time types. The first is deprecated and should not be used; we include it here in case you encounter it in older q code.

**A datetime (deprecated)** is the lexical combination of a date and a time, separated by T as in the ISO standard format. A datetime value is stored as a **float holding the fractional day count from midnight Jan 1, 2000.**

In [29]:
2000.01.01T12:00:00.000
2000.01.02T12:00:00.000=1.5
`float$2000.01.02T12:00:00.000
`date$2000.01.02T12:00:00.000
`time$2000.01.02T12:00:00.000

2000.01.01T12:00:00.000


1b


1.5


2000.01.02


12:00:00.000


#### Timestamp

The preferred type is **timestamp**, which is the lexical combination of a date and a timespan, separated by D. The underlying timestamp value is a **long representing the count of nanoseconds since the millennium**. Post-millennium is positive and pre- is negative.

In [30]:
2014.11.22D17:43:40.123456789
`long$2014.11.22D17:43:40.123456789
`date$2014.11.22D17:43:40.123456789
`time$2014.11.22D17:43:40.123456789
`timespan$2014.11.22D17:43:40.123456789

2014.11.22D17:43:40.123456789


469993420123456789


2014.11.22


17:43:40.123


0D17:43:40.123456789


Use a timestamp instead of a datetime for a key column or in a join, or separate the value into date and time columns.
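
A sketch of splitting a timestamp into its two parts and recombining them; the names `ts`, `d` and `t` are assumptions for illustration. Here the time part is kept as a timespan so that no precision is lost.

```q
ts:2014.11.22D17:43:40.123456789
d:`date$ts          / 2014.11.22
t:`timespan$ts      / 0D17:43:40.123456789
ts~d+t              / 1b: date plus timespan rebuilds the timestamp
```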

#### 2.5.4 month

The month type is stored as a 32-bit signed integer and is denoted by **yyyy.mm with a trailing type indicator m**. A month value is the count of months since the beginning of the millennium. Post-millennium is positive and pre- is negative.

In [37]:
2015.11m
2000.01m=0 / starts with 0
2000.09m=8
2014.11 / without the trailing m this is a float (shown as 2e+003 because the earlier \P 2 is still in effect)
`int$2015.01m / underlying month count 
2015.07m=2015.07.01 /  the first day of the month is equal to the month

2015.11m


1b


1b


2e+003


180i


1b
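
Since a month is a count of months, integer arithmetic and casts behave as one would hope. A short sketch:

```q
2015.11m+2          / 2016.01m: month arithmetic rolls over the year
`date$2015.07m      / 2015.07.01: first day of the month
`month$2015.07.18   / 2015.07m: the month containing a date
```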


#### 2.5.5 minute

The minute type is stored as a 32-bit signed integer and is denoted by **hh:mm**. A minute value counts the number of minutes from midnight.

In [41]:
12:30
12:30=30+60*12
`int$12:00 / The underlying minute count can be obtained by casting to int.
12:00=12:00:00.000
12:00=12:00:00.000000000

12:30


1b


720i


1b


1b


#### 2.5.6 second

The second type is stored as a 32-bit signed integer and is denoted by **hh:mm:ss**. A second value counts the number of seconds from midnight.

In [42]:
23:59:59
23:59:59=(24*60*60)-1

23:59:59


1b


In [45]:
`int$12:34:56            / second:   number of seconds
`int$12:34:56.000        / time:     number of milliseconds
`long$12:34:56.000000000 / timespan: number of nanoseconds

45296i


45296000i


45296000000000


The underlying counts differ because each type uses a different unit. Nevertheless, these values are equal in the eyes of q, as they should be, since they are merely representations in different units of the same position on a clock.

In [47]:
12:34:56=12:34:56.000
12:34:56.000=12:34:56.000000000
12:34:56=12:34:56.000000000

1b


1b


1b
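
Equality is not identity, though. As a sketch, match (`~`) distinguishes the two values because their types differ:

```q
12:34:56=12:34:56.000    / 1b: equal after type promotion
12:34:56~12:34:56.000    / 0b: second and time are different types
type 12:34:56            / -18h: second
type 12:34:56.000        / -19h: time
```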


#### 2.5.7 Constituents and Dot Notation

The constituents of compound temporal types can be extracted using dot notation. For example, the field values of a date are named year, mm and dd; similarly for time and other temporal types.

In [50]:
dt:2018.09.18
dt.year
dt.mm
dt.dd

2018i


9i


18i


In [54]:
ti:12:34:56.789
ti.hh
ti.mm
ti.ss

12i


34i


56i


Unfortunately, at the time of this writing (Sep 2015) dot notation for extraction (still) does not work inside functions.
Thus we recommend avoiding dot notation altogether and using cast instead, as it always works for any meaningful temporal extraction or conversion. In addition to the individual field values, you can also extract higher-order constituents.

In [55]:
`year$dt
`mm$dt
`dd$dt
`month$dt

2018i


9i


18i


2018.09m
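
A minimal sketch of the cast form used inside functions, where dot notation is said not to work. The helper names `getYear` and `getMonth` are hypothetical.

```q
getYear:{`year$x}          / hypothetical helper: year of a date
getMonth:{`month$x}        / hypothetical helper: month of a date
getYear 2018.09.18         / 2018i
getMonth 2018.09.18        / 2018.09m
```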


To extract milliseconds or nanoseconds from a time type, cast to the underlying integer and mod the result by 1000 or 1000000000.

In [56]:
(`int$12:34:56.789) mod 1000
(`long$12:34:56.123456789) mod 1000000000

789


123456789


### 2.6 Arithmetic Infinities and Nulls

In [57]:
0w  / Positive float infinity
-0w / Negative float infinity
0n  / Null float ; NaN, or not a number
0W  / Positive long infinity
-0W / Negative long infinity
0N  / Null long

0w


-0w


0n


0W


-0W


0N


In [61]:
4%2 / In q, division of numeric values always results in a float.
1%0
-1%0
0%0

2f


0w


-0w


0n


The q philosophy is that any valid arithmetic expression will produce a result rather than a runtime error. Therefore, dividing by 0 produces a special float value rather than throwing an exception. You can perform a complex sequence of calculations without worrying about things blowing up in the middle or having to insert cumbersome exception trapping.
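
As a sketch of what this means in practice, the special values simply flow through later arithmetic; the variable name `r` is an assumption for illustration.

```q
r:1+10%0        / 10%0 is 0w, and 0w+1 is still 0w
r=0w            / 1b
null 0%0        / 1b: 0%0 is the float null 0n
null 1+0%0      / 1b: arithmetic on a null yields a null
```

A calculation can therefore be checked once at the end, using `null` or a comparison with `0w`, rather than guarded at every step.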


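The long infinities and the long null occupy the extreme values of the underlying 64-bit signed integer.

0N   / MIN_INT    -9223372036854775808
-0W  / MIN_INT+1  -9223372036854775807
0W   / MAX_INT    +9223372036854775807

Consequently, the ordering on integers is

0N < -0W < normal integer < 0W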

In [64]:
0N<-0W
-0W<42
42<0W

1b


1b


1b
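
Because these are ordinary bit patterns, integer arithmetic does not treat them specially and simply wraps around. A sketch of the consequences:

```q
0W+1       / 0N: one past the maximum wraps to the null
0W+2       / -0W: two past the maximum is negative infinity
-0W-1      / 0N: one below negative infinity is the null
```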


### 2.7 Nulls

In languages such as C and Java, null typically denotes a reference or pointer that points to nothing. The q situation is more interesting. There are no references or pointers, so the notion of an unallocated entity does not arise. Most types have null values that are distinct from "normal" values and occupy the same amount of storage. **Some types do not designate a distinct null value because there is no available bit pattern – i.e., for boolean, byte and char all underlying bit patterns are meaningfully employed. In this case, the value with no information content serves as a proxy for null.**

**Testing for Null**

You could test for null using = but this requires a null literal of the correct type. Because q is dynamically typed, this can result in problems if a variable changes type during program execution.

Always use the monadic null to test a value for null, as opposed to =, as it provides a type-independent check. Also, you don't have to remember the funky null literals.

In [69]:
null 0b
null `
null " "
null ""
null 0n


0b


1b


1b


`boolean$()


1b
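
A short sketch of `null` applied to lists; the sample values are chosen only for illustration. It works item-wise, and the fill operator `^` replaces any nulls found.

```q
null 1 0N 3                      / 010b: null applies item-wise to a list
null each (0N;0n;`;" ";0Nd)      / 11111b: one test covers every type
0^1 0N 3                         / 1 0 3: fill nulls with a default
```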
