# JSON and API

## What is JSON?

`JSON` (`JavaScript Object Notation`) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language Standard.

`API` is the acronym for `Application Programming Interface`, which is a software intermediary that allows two applications to talk to each other. 

One of the most popular R packages for working with JSON is `jsonlite`.

In [151]:
#install.packages("jsonlite")
library(jsonlite)
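
As a small illustration of what `jsonlite` does, the sketch below parses a JSON string into an R object and serializes it back. The JSON text and the variable names are made up for this example; `fromJSON()` and `toJSON()` are the package's own functions.

```r
library(jsonlite)

# A made-up JSON document (illustrative values only)
json_text <- '{"symbol": "BTCUSDT", "interval": "1h", "limits": [100, 500, 1000]}'

# Parse JSON text into an R object: a JSON object becomes a named list,
# and a JSON array of numbers becomes a vector
parsed <- fromJSON(json_text)
print(parsed$symbol)
print(parsed$limits)

# Serialize the R object back to JSON; auto_unbox = TRUE writes length-one
# vectors as scalars instead of one-element arrays
json_again <- toJSON(parsed, auto_unbox = TRUE)
print(json_again)
```

Without `auto_unbox = TRUE`, `"BTCUSDT"` would be written as `["BTCUSDT"]`, because every R vector is treated as an array by default.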

Let's read information about the BTC and USDT cryptocurrencies from Binance.

In [152]:
market <- 'BTCUSDT'  # trading pair
interval <- '1h'     # candlestick interval
limit <- 100         # number of candlesticks to request

url <- paste0("https://api.binance.com/api/v3/klines?symbol=", market, "&interval=", interval, "&limit=", limit)
print(url) # complete request URL

[1] "https://api.binance.com/api/v3/klines?symbol=BTCUSDT&interval=1h&limit=100"
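
If the request is built in more than one place, the same URL can be assembled by a small function. The sketch below is one possible way to do it; `build_klines_url()` is a hypothetical helper written for this example, not part of `jsonlite` or of any Binance client. `URLencode()` comes from the `utils` package that ships with R.

```r
# Hypothetical helper (not part of any package): builds the klines request URL
build_klines_url <- function(symbol, interval = "1h", limit = 100) {
  base <- "https://api.binance.com/api/v3/klines"
  # Collect the query parameters as a named character vector
  params <- c(symbol = symbol, interval = interval, limit = limit)
  # URLencode() escapes characters that are not safe inside a URL
  values <- vapply(params,
                   function(v) URLencode(as.character(v), reserved = TRUE),
                   character(1))
  # Join name=value pairs with "&" to form the query string
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(base, "?", query)
}

url_built <- build_klines_url("BTCUSDT", interval = "1h", limit = 100)
print(url_built)
```

Keeping the parameters in a named vector makes it easy to add or remove one without editing a long `paste0()` call.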


The next step is to call the `fromJSON()` function to download and parse the data.

More details about requests to Binance: https://github.com/binance/binance-spot-api-docs/blob/master/rest-api.md#klinecandlestick-data

If you open the `url` value in a browser, the response looks like this (the comments are annotations from the documentation, not part of the actual response):

```json
[
  [
    1499040000000,      // Open time
    "0.01634790",       // Open
    "0.80000000",       // High
    "0.01575800",       // Low
    "0.01577100",       // Close
    "148976.11427815",  // Volume
    1499644799999,      // Close time
    "2434.19055334",    // Quote asset volume
    308,                // Number of trades
    "1756.87402397",    // Taker buy base asset volume
    "28.46694368",      // Taker buy quote asset volume
    "17928899.62484339" // Ignore.
  ]
]
```
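
To see what R structure this kind of response turns into without making a network call, the sketch below parses the same sample row from an inline string. The annotations are removed because JSON has no comment syntax. The variable names are chosen for this example only.

```r
library(jsonlite)

# The sample row from the documentation, without the annotations
sample_json <- '[[1499040000000, "0.01634790", "0.80000000", "0.01575800",
  "0.01577100", "148976.11427815", 1499644799999, "2434.19055334", 308,
  "1756.87402397", "28.46694368", "17928899.62484339"]]'

klines <- fromJSON(sample_json)

# Each row mixes numbers and strings, so jsonlite stores every value as text
print(is.matrix(klines))
print(typeof(klines))
print(dim(klines))

# Values therefore need an explicit conversion before any arithmetic
open_price <- as.numeric(klines[1, 2])
print(open_price)
```

This is why the columns show up as character later on and have to be converted to numbers explicitly.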

In [153]:
data <- fromJSON(url) # download the JSON and parse it; the result here is a character matrix
data <- data[, 1:7] # keep only columns 1 to 7 (from Open time to Close time)
head(data)

| `[,1]` | `[,2]` | `[,3]` | `[,4]` | `[,5]` | `[,6]` | `[,7]` |
|---|---|---|---|---|---|---|
| 1650513600000.0 | 41693.58 | 41750.0 | 41525.0 | 41610.01 | 1138.64337 | 1650517199999 |
| 1650517200000.0 | 41610.01 | 41699.0 | 41434.44 | 41462.76 | 1229.25936 | 1650520799999 |
| 1650520800000.0 | 41462.75 | 41600.0 | 41419.2 | 41522.38 | 1049.71244 | 1650524399999 |
| 1650524400000.0 | 41522.38 | 41940.0 | 41451.0 | 41855.69 | 1928.48091 | 1650527999999 |
| 1650528000000.0 | 41855.69 | 42050.3 | 41741.1 | 41922.97 | 2518.0409 | 1650531599999 |
| 1650531600000.0 | 41922.96 | 41971.9 | 41743.96 | 41803.7 | 1655.76993 | 1650535199999 |


In [154]:
typeof(data) # check data type
data <- as.data.frame(data) # convert to dataframe
head(data)

| | V1 | V2 | V3 | V4 | V5 | V6 | V7 |
|---|---|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<chr>` | `<chr>` | `<chr>` | `<chr>` |
| 1 | 1650513600000.0 | 41693.58 | 41750.0 | 41525.0 | 41610.01 | 1138.64337 | 1650517199999 |
| 2 | 1650517200000.0 | 41610.01 | 41699.0 | 41434.44 | 41462.76 | 1229.25936 | 1650520799999 |
| 3 | 1650520800000.0 | 41462.75 | 41600.0 | 41419.2 | 41522.38 | 1049.71244 | 1650524399999 |
| 4 | 1650524400000.0 | 41522.38 | 41940.0 | 41451.0 | 41855.69 | 1928.48091 | 1650527999999 |
| 5 | 1650528000000.0 | 41855.69 | 42050.3 | 41741.1 | 41922.97 | 2518.0409 | 1650531599999 |
| 6 | 1650531600000.0 | 41922.96 | 41971.9 | 41743.96 | 41803.7 | 1655.76993 | 1650535199999 |


In [155]:
# set descriptive column names
colnames(data) <- c("Open_time", "Open", "High", "Low", "Close", "Volume", "Close_time")
head(data) # looks better, but the columns are still character

| | Open_time | Open | High | Low | Close | Volume | Close_time |
|---|---|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<chr>` | `<chr>` | `<chr>` | `<chr>` |
| 1 | 1650513600000.0 | 41693.58 | 41750.0 | 41525.0 | 41610.01 | 1138.64337 | 1650517199999 |
| 2 | 1650517200000.0 | 41610.01 | 41699.0 | 41434.44 | 41462.76 | 1229.25936 | 1650520799999 |
| 3 | 1650520800000.0 | 41462.75 | 41600.0 | 41419.2 | 41522.38 | 1049.71244 | 1650524399999 |
| 4 | 1650524400000.0 | 41522.38 | 41940.0 | 41451.0 | 41855.69 | 1928.48091 | 1650527999999 |
| 5 | 1650528000000.0 | 41855.69 | 42050.3 | 41741.1 | 41922.97 | 2518.0409 | 1650531599999 |
| 6 | 1650531600000.0 | 41922.96 | 41971.9 | 41743.96 | 41803.7 | 1655.76993 | 1650535199999 |


In [156]:
is.numeric(data[,1]) # check 1st column type is numeric
is.numeric(data[,2]) # check 2nd column type is numeric
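
Checking columns one by one gets tedious for wider tables. As a sketch, the types of all columns can be inspected in a single call with `sapply()`. The toy data frame `df` below stands in for the API result and reuses two rows of prices from the output above.

```r
# Toy data frame standing in for the API result (prices stored as text)
df <- data.frame(Open = c("41693.58", "41610.01"),
                 Close = c("41610.01", "41462.76"),
                 stringsAsFactors = FALSE)

# Class of every column in one call
col_classes <- sapply(df, class)
print(col_classes)

# TRUE only if every column is numeric
all_numeric <- all(sapply(df, is.numeric))
print(all_numeric)
```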

In [157]:
data <- as.data.frame(sapply(data, as.numeric)) # convert all columns to numeric
head(data) # good, the columns are double now

| | Open_time | Open | High | Low | Close | Volume | Close_time |
|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | 1650514000000.0 | 41693.58 | 41750.0 | 41525.0 | 41610.01 | 1138.643 | 1650517000000.0 |
| 2 | 1650517000000.0 | 41610.01 | 41699.0 | 41434.44 | 41462.76 | 1229.259 | 1650521000000.0 |
| 3 | 1650521000000.0 | 41462.75 | 41600.0 | 41419.2 | 41522.38 | 1049.712 | 1650524000000.0 |
| 4 | 1650524000000.0 | 41522.38 | 41940.0 | 41451.0 | 41855.69 | 1928.481 | 1650528000000.0 |
| 5 | 1650528000000.0 | 41855.69 | 42050.3 | 41741.1 | 41922.97 | 2518.041 | 1650532000000.0 |
| 6 | 1650532000000.0 | 41922.96 | 41971.9 | 41743.96 | 41803.7 | 1655.77 | 1650535000000.0 |
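
`sapply()` returns a matrix here, which is why the result has to be wrapped in `as.data.frame()` again. An alternative sketch, shown on a toy data frame `df` with values taken from the output above, assigns into `df[]` with `lapply()` so the object stays a data frame throughout. It also shows what happens to text that is not a number.

```r
# Toy data frame with numeric values stored as text
df <- data.frame(Open = c("41693.58", "41610.01"),
                 Volume = c("1138.64337", "1229.25936"),
                 stringsAsFactors = FALSE)

# df[] <- ... replaces the columns in place and keeps the data frame structure
df[] <- lapply(df, as.numeric)
str(df)

# Text that is not a number becomes NA (with a warning), so it is worth checking
converted <- suppressWarnings(as.numeric(c("41693.58", "n/a")))
print(is.na(converted))
```

The values are stored at full precision after conversion; the shorter numbers in the printed table are only a display setting.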


The final step is to convert `Open_time` and `Close_time` from Unix timestamps in milliseconds to date-times.

In [158]:
data$Open_time <- as.POSIXct(data$Open_time/1e3, origin = '1970-01-01')
data$Close_time <- as.POSIXct(data$Close_time/1e3, origin = '1970-01-01')

head(data) 

| | Open_time | Open | High | Low | Close | Volume | Close_time |
|---|---|---|---|---|---|---|---|
| | `<dttm>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dttm>` |
| 1 | 2022-04-21 07:00:00 | 41693.58 | 41750.0 | 41525.0 | 41610.01 | 1138.643 | 2022-04-21 07:59:59 |
| 2 | 2022-04-21 08:00:00 | 41610.01 | 41699.0 | 41434.44 | 41462.76 | 1229.259 | 2022-04-21 08:59:59 |
| 3 | 2022-04-21 09:00:00 | 41462.75 | 41600.0 | 41419.2 | 41522.38 | 1049.712 | 2022-04-21 09:59:59 |
| 4 | 2022-04-21 10:00:00 | 41522.38 | 41940.0 | 41451.0 | 41855.69 | 1928.481 | 2022-04-21 10:59:59 |
| 5 | 2022-04-21 11:00:00 | 41855.69 | 42050.3 | 41741.1 | 41922.97 | 2518.041 | 2022-04-21 11:59:59 |
| 6 | 2022-04-21 12:00:00 | 41922.96 | 41971.9 | 41743.96 | 41803.7 | 1655.77 | 2022-04-21 12:59:59 |
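
When `as.POSIXct()` is called without a `tz` argument, the times are displayed in the time zone of the R session, so the same code can print different hours on different machines. The first `Open_time` above, for instance, corresponds to 04:00 UTC and is shown three hours ahead. A minimal sketch of making the result reproducible by passing `tz = "UTC"` explicitly (the variable names are chosen for this example):

```r
# First Open_time value from the response, in milliseconds since 1970-01-01 UTC
open_ms <- 1650513600000

# Divide by 1000 to get seconds; tz = "UTC" fixes the time zone used for display
open_utc <- as.POSIXct(open_ms / 1000, origin = "1970-01-01", tz = "UTC")
print(open_utc)

# Format as text, still in UTC
open_text <- format(open_utc, "%Y-%m-%d %H:%M:%S")
print(open_text)
```

The stored instant is the same in both cases; only the way it is printed depends on the time zone.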


In [159]:
tail(data) # check last records

| | Open_time | Open | High | Low | Close | Volume | Close_time |
|---|---|---|---|---|---|---|---|
| | `<dttm>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dttm>` |
| 95 | 2022-04-25 05:00:00 | 39095.81 | 39153.94 | 38961.64 | 39091.17 | 1205.5158 | 2022-04-25 05:59:59 |
| 96 | 2022-04-25 06:00:00 | 39091.17 | 39294.76 | 39086.37 | 39253.71 | 1443.3318 | 2022-04-25 06:59:59 |
| 97 | 2022-04-25 07:00:00 | 39253.7 | 39256.28 | 39055.71 | 39139.74 | 896.8554 | 2022-04-25 07:59:59 |
| 98 | 2022-04-25 08:00:00 | 39139.74 | 39230.5 | 38947.42 | 38975.22 | 1057.49 | 2022-04-25 08:59:59 |
| 99 | 2022-04-25 09:00:00 | 38975.21 | 39057.97 | 38590.0 | 38636.35 | 2814.9716 | 2022-04-25 09:59:59 |
| 100 | 2022-04-25 10:00:00 | 38636.35 | 38675.68 | 38200.0 | 38534.99 | 3528.2355 | 2022-04-25 10:59:59 |
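
The steps above (select columns, name them, convert to numeric, convert timestamps) can be collected into one function. The sketch below is one possible arrangement; `klines_to_df()` is a hypothetical helper written for this example, and it uses `tz = "UTC"` by default, which differs from the session-local times shown in the output above. It is tested on two rows copied from that output so that it runs without a network call.

```r
# Hypothetical helper: turns the character matrix returned by fromJSON()
# into a typed data frame with the first seven kline fields
klines_to_df <- function(m, tz = "UTC") {
  stopifnot(is.matrix(m), ncol(m) >= 7)
  # drop = FALSE keeps a matrix even when there is only one row
  df <- as.data.frame(m[, 1:7, drop = FALSE], stringsAsFactors = FALSE)
  colnames(df) <- c("Open_time", "Open", "High", "Low", "Close", "Volume", "Close_time")
  # Convert every column from text to numeric, keeping the data frame
  df[] <- lapply(df, as.numeric)
  # Timestamps are in milliseconds, as.POSIXct() expects seconds
  df$Open_time  <- as.POSIXct(df$Open_time / 1000, origin = "1970-01-01", tz = tz)
  df$Close_time <- as.POSIXct(df$Close_time / 1000, origin = "1970-01-01", tz = tz)
  df
}

# Two rows copied from the output above, as text
m <- rbind(
  c("1650513600000", "41693.58", "41750.0", "41525.0", "41610.01", "1138.64337", "1650517199999"),
  c("1650517200000", "41610.01", "41699.0", "41434.44", "41462.76", "1229.25936", "1650520799999")
)
result <- klines_to_df(m)
str(result)
```

With a live response, the call would be `klines_to_df(fromJSON(url))`.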


## Datasets

1. https://github.com/kleban/r-book-published/tree/main/datasets/telecom_users.csv
2. https://github.com/kleban/r-book-published/tree/main/datasets/telecom_sers.xlsx
3. https://github.com/kleban/r-book-published/tree/main/datasets/Default_Fin.csv
4. https://github.com/kleban/r-book-published/tree/main/datasets/employes.xml

---

## References

1. [SQLite in R. Datacamp](https://www.datacamp.com/community/tutorials/sqlite-in-r)
2. [Tidyverse googlesheets4 0.2.0](https://www.tidyverse.org/blog/2020/05/googlesheets4-0-2-0/)
<!-- 3. [Telecom users dataset. Practice classification with a telco dataset.Kaggle](https://www.kaggle.com/radmirzosimov/telecom-users-dataset) -->
4. [Binance Spot API Docs](https://github.com/binance/binance-spot-api-docs/blob/master/rest-api.md#klinecandlestick-data)
5. [Web Scraping in R: rvest Tutorial](https://www.datacamp.com/community/tutorials/r-web-scraping-rvest) by Arvid Kingl