In [1]:
#r "nuget: FsHttp"
open FsHttp

# Tips and tricks for using Web APIs in FSharp

_A broad introduction on web APIs and tips and tricks for using them in FSharp._

## Web APIs and their importance in modern (web) development

A web API (**A**pplication **P**rogramming **I**nterface) is a way for computers to communicate with each other over the internet. 

It is a set of rules that defines how computers should communicate with each other:

the **server**:
  - provides and defines the API
  - defines a set of rules for the **client** to follow

the **client** 
  - has to follow the rules provided for the API.
  - can be any program that can communicate over the internet (e.g. a web browser, a mobile app, a desktop app, a server, a script, a **notebook**...)

A web API is usually used to **request** data from a server. The server will then **respond** with the requested data.

## Understanding the HTTP Protocol

_See also: https://en.wikipedia.org/wiki/HTTP_

The Hypertext Transfer Protocol (HTTP) is the foundation of data communication for the World Wide Web. It is a set of rules that defines how computers should communicate with each other over the internet.

HTTP functions as a request–response protocol between a **client**-**server** pair. 

The client submits an HTTP request message to the server. Requests contain the [request method](#http-request-methods) and are performed on a resource provided by the server via a URL (Uniform Resource Locator). A request can also contain additional information in the form of [headers](#http-headers) and a message body.

The server returns a response message to the client after processing the request. The response contains completion status information about the request and may also contain requested content in its message body.

### HTTP request methods

HTTP defines methods (sometimes referred to as verbs) to indicate the desired action to be performed on the identified resource. What this resource represents, whether pre-existing data or data that is generated dynamically, depends on the implementation of the server. Often, the resource corresponds to a file or the output of an executable residing on the server.

| Method | Description | Request message has body content | Response message has body content |
| --- | --- | --- | --- |
| GET | Requests a representation of the specified resource. Requests using GET should only retrieve data. | Optional | Yes |
| HEAD | Asks for a response identical to that of a GET request, but without the response body. | Optional | No |
| POST | Submits an entity to the specified resource, often causing a change in state or side effects on the server. | Yes | Yes |
| PUT | Replaces all current representations of the target resource with the request payload. | Yes | Yes |
| DELETE | Deletes the specified resource. | Optional | Yes |
| CONNECT | Establishes a tunnel to the server identified by the target resource. | Optional | Yes |
| OPTIONS | Describes the communication options for the target resource. | Optional | Yes |
| TRACE | Performs a message loop-back test, meaning that the server should return the received request as payload. | No | Yes |
| PATCH | Applies partial modifications to a resource. | Yes | Yes |

The methods most often encountered are **GET** and **POST** requests, for accessing and sending data respectively.

### HTTP Headers

HTTP header fields are a list of strings sent and received by both the client program and server on every HTTP request and response. 

They are basically metadata about the request/response and can contain information about the client, the server, the body content, the request itself, the response itself, etc.

### Query parameters

Query parameters are not part of the HTTP protocol, but are often used in web APIs. They are a way to pass additional information to the server in the URL.

They are usually added to the URL after a question mark `?` and are separated by an ampersand `&`.

Example: `https://example.com/path/to/resource?param1=value1&param2=value2`, will pass the parameters `param1` and `param2` with the values `value1` and `value2` to the server.

Since query parameters are not part of the HTTP protocol, the API must define the set of parameters that are accepted by the server, meaning you cannot expect that every API will accept the same parameters.
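As a minimal sketch of how such a URL is put together, the following snippet builds the query string by hand. `buildUrl` is a hypothetical helper written for this illustration (it is not part of any library); it uses `Uri.EscapeDataString` from .NET so that characters such as spaces are escaped.

```fsharp
open System

// Hypothetical helper (not part of any library):
// appends escaped query parameters to a base URL.
let buildUrl (baseUrl: string) (parameters: (string * string) list) =
    let query =
        parameters
        |> List.map (fun (key, value) ->
            Uri.EscapeDataString(key) + "=" + Uri.EscapeDataString(value))
        |> String.concat "&"
    // only add the question mark when there is at least one parameter
    if query = "" then baseUrl else baseUrl + "?" + query

let url =
    buildUrl "https://example.com/path/to/resource" [
        "param1", "value1"
        "param2", "value 2"
    ]

printfn "%s" url
// prints: https://example.com/path/to/resource?param1=value1&param2=value%202
```

In practice, an HTTP client library usually does this escaping for you, so a helper like this is mostly useful for understanding what ends up in the URL.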

### Request Example

In the end, an HTTP request is just a string that follows a certain format:

```
<request line>
<headers>

[<message body>]
```


So a **GET** request for the image `logo.png` on the website `www.example.com` located at `/images` can for example look like this:

```
GET /images/logo.png HTTP/1.1 (request line)
Host: www.example.com (header 1)
Accept-Language: en (header 2)

[<message body>](optional, any content)
```
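To connect the raw text format with code, here is a small sketch that builds (but does not send) the same request using `HttpRequestMessage` from .NET's `System.Net.Http` and prints its parts. The variable names are only illustrative, and the `Host` line is derived from the URL here, because the actual header is only added when the request is sent.

```fsharp
open System.Net.Http

// Build (but do not send) the GET request from the example above.
let request = new HttpRequestMessage(HttpMethod.Get, "https://www.example.com/images/logo.png")
request.Headers.Add("Accept-Language", "en")

// method and path as they would appear in the request line
printfn "%s %s" request.Method.Method request.RequestUri.AbsolutePath
// the host is part of the URL
printfn "Host: %s" request.RequestUri.Host
// all headers that were set explicitly
for header in request.Headers do
    let values = String.concat ", " header.Value
    printfn "%s: %s" header.Key values
```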

### HTTP response codes and their meanings

Status codes are issued by a server in response to a client's request made to the server. All HTTP response status codes are separated into five classes or categories. The first digit of the status code defines the class of response, while the last two digits do not have any classifying or categorization role:

- `1xx informational response` – the request was received, continuing process
- `2xx successful` – the request was successfully received, understood, and accepted
- `3xx redirection` – further action needs to be taken in order to complete the request
- `4xx client error` – the request contains bad syntax or cannot be fulfilled
- `5xx server error` – the server failed to fulfil an apparently valid request

A few common, official status codes are:

- `200 OK` – Standard response for successful HTTP requests.
- `301 Moved Permanently` – This and all future requests should be directed to the given URI.
- `404 Not Found` – The requested resource could not be found but may be available in the future.
- `500 Internal Server Error` – A generic error message, given when an unexpected condition was encountered and no more specific message is suitable.
- `503 Service Unavailable` – The server is currently unavailable (because it is overloaded or down for maintenance). Generally, this is a temporary state.
  
Not all of the 100 possible status codes of each category have official meanings. There is room in the HTTP status code numbering space for developers to define their own custom HTTP status codes for their applications.
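Because the class of a status code is determined by its first digit alone, it can be derived with an integer division. The function below is a small sketch of that idea; `statusClass` is a name made up for this example.

```fsharp
// Hypothetical helper: maps a status code to the name of its class,
// based only on the first digit of the code.
let statusClass (code: int) =
    match code / 100 with
    | 1 -> "informational response"
    | 2 -> "successful"
    | 3 -> "redirection"
    | 4 -> "client error"
    | 5 -> "server error"
    | _ -> "unknown"

[200; 301; 404; 503]
|> List.iter (fun code -> printfn "%d: %s" code (statusClass code))
```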


### Common Response message formats

The format of message body content is usually defined by the API. The most common formats are:
- **HTML** (Hypertext Markup Language) - used for web pages
- [**JSON**](#working-with-json-format) (JavaScript Object Notation) - used for data transfer
- **XML** (Extensible Markup Language) - used for data transfer



### Response Example

Just as the request, the response is also just a string that follows a certain format:

``` 
<status line>
<headers>

[<message body>]
```

So a response for the request above could look like this:

```
HTTP/1.1 200 OK (status line)
Content-Type: image/png  (header 1)

[<message body>](contains the image data)
```

## Working with JSON Format

### JSON structure and syntax

JSON (**J**ava**S**cript **O**bject **N**otation) is a lightweight, text-based data-interchange format. It is inspired by JavaScript object literal syntax.

JSON data is organized as key-value pairs, where each key (property) is a string enclosed in double quotes, followed by a colon, and then the corresponding value. The value can be any `JSON data type`:

- `String`: Enclosed in double quotes, represent a sequence of characters.
    Example: 
    ```json
    "key": "Hello, World!"
    ```

- `Number`: Represents numeric values, including integers and floating-point numbers.
    Example: 
    ```json
    "key": 42
    "key2": 69.420
    ```

- `Boolean`: Represents either true or false.
    Example: 
    ```json
    "key": true
    ```

- `Null`: Represents the absence of a value.
    Example: 
    ```json
    "key": null
    ```

- `Array`: Ordered list of values enclosed in square brackets ([]), separated by commas.
    Example: 
    ```json
    "key": [1, 2, 3, 4]
    ```

- `Object`: Unordered collections of key-value pairs enclosed in curly braces ({}), separated by commas.
    Example: 
    ```json
    {"name": "John", "age": 30}
    ```

When used as a data exchange format, all key-value pairs are usually put into a `root object` that has no key:

```json
{
    "key": "value",
    "key2": 42,
    "key3": true,
    "key4": null,
    "key5": [1, 2, 3, 4],
    "key6": {"name": "John", "age": 30}
}
```

### JSON serialization and deserialization in FSharp

One of the easiest ways of working with JSON in F# is to use `System.Text.Json`, which is part of .NET. There are extensions for fully idiomatic F# support, but they are not necessary for the examples in this post.

There are a few key data types in `System.Text.Json` that we will use shortly:
- `JsonDocument`: an in-memory Document Object Model (DOM) representation of the JSON object
- `JsonElement`: a single value of a property. Can be of any JSON data type
  - can be converted to an actual value, e.g. with the `.GetInt32()` method (which fails if the value is not a matching number)
  - can be deserialized into any type using the `.Deserialize<'T>` method (which fails if the JSON does not match the type)

Here is a little picture to visualize where these types fit in:

![System.Text.Json](../../img/webapis/stj.png)
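As a short sketch of how these types relate in code, the following snippet parses a small JSON string (made up for this example, reusing keys from the root object shown earlier) and reads two of its properties using only `System.Text.Json`:

```fsharp
open System.Text.Json

// parse the JSON text into a DOM
let document = JsonDocument.Parse("""{"key2": 42, "key5": [1, 2, 3, 4]}""")

// the root element is a JsonElement that represents the whole root object
let root : JsonElement = document.RootElement

// look up a property and convert it to an actual value
let key2 = root.GetProperty("key2")
printfn "%A" key2.ValueKind      // Number
printfn "%d" (key2.GetInt32())   // 42

// deserialize a property into a .NET type
let key5 = root.GetProperty("key5").Deserialize<int[]>()
printfn "%A" key5                // [|1; 2; 3; 4|]
```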

#### Dynamic lookup

FsHttp includes a [dynamic lookup operator](https://github.com/fsprojects/FsHttp/blob/5d587f0ef25d47df0089268ed3887d47250ffffb/src/FsHttp/Operators.fs#L5-L7) for `JsonElement`, which is very handy for exploring JSON without knowing the exact structure.

Dynamic lookup on JSON elements is done using the `?` operator:

In [40]:
open System.Text.Json
open FsHttp

let jsonString = """{"FirstName": "Kevin", "LastName": "Schneider"}"""

// the root element of the parsed document is a JsonElement
let jElement = JsonDocument.Parse(jsonString).RootElement

jElement?FirstName

#### Strongly typed deserialization

We can also deserialize JSON into a strongly typed F# record. This makes most sense when we know what the JSON looks like.

First, we need to define a record type to map our JSON to:

In [3]:
type Person = {
    FirstName: string
    LastName: string
}

We can then deserialize the JSON into this record type using `JsonSerializer.Deserialize`:

In [42]:
let typedJson = JsonSerializer.Deserialize<Person>(jsonString)

typedJson

jElement.Deserialize<Person>()

| Field | Value |
| --- | --- |
| FirstName | Kevin |
| LastName | Schneider |
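The opposite direction, serialization, works with `JsonSerializer.Serialize`. The following sketch redefines a record of the same shape so that it runs on its own, and checks that a value survives a round trip through JSON; the value `kevin` is only an example.

```fsharp
open System.Text.Json

// same shape as the Person record used for deserialization
type Person = {
    FirstName: string
    LastName: string
}

let kevin = { FirstName = "Kevin"; LastName = "Schneider" }

// record -> JSON string
let serialized = JsonSerializer.Serialize(kevin)
printfn "%s" serialized
// prints: {"FirstName":"Kevin","LastName":"Schneider"}

// JSON string -> record, which should give back an equal value
let roundTripped = JsonSerializer.Deserialize<Person>(serialized)
printfn "%b" (roundTripped = kevin)
// prints: true
```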


#### Combining dynamic and strongly typed deserialization

In some cases, we only want to deserialize a single property of a JSON object into a record, ignoring the other fields. We can do this by using a combination of dynamic and strongly typed deserialization. 

Consider we have this JSON string:

```json
{
    "uninteresting_element": {"bla": "bla" },
    "interesting_element": {"this": "is interesting"}
}
```

where we only want to deserialize the `interesting_element` property into a record.

We can access the `interesting_element` property using dynamic lookup, and then deserialize it into a record like this:

In [43]:
let json_string = """{
    "uninteresting_element": {"bla": "bla" },
    "interesting_element": {"this": "is interesting"}
}"""

type InterestingElement = {this: string}

JsonDocument.Parse(json_string)
    .RootElement?interesting_element
    .Deserialize<InterestingElement>()

| Field | Value |
| --- | --- |
| this | is interesting |


## Making HTTP Requests in FSharp

### FsHttp

[FsHttp](https://fsprojects.github.io/FsHttp/) is a .NET HTTP client library for C# and F#. It aims to describe and execute HTTP requests in convenient ways, while also providing functions for handling responses.

For simple demonstration purposes in this section, I will use the [reqres](https://reqres.in) API, which is a free API for testing HTTP requests and responses. Swagger documentation of the API can be found here: https://reqres.in/api-docs/

Let's start with a simple GET request to the `users` endpoint of the reqres API. The respective documentation can be found here: https://reqres.in/api-docs/#/default/get_users

![](../../img/webapis/users-endpoint.png)

The documentation shows us that
  - the endpoint is `https://reqres.in/api/users`
  - the request method is `GET`
  - the endpoint accepts 2 parameters: `page` and `per_page`
  - the response is a JSON object with a `data` property, which is an array of `user` objects:
  ```json
    {
      "page": 0,
      "per_page": 0,
      "total": 0,
      "total_pages": 0,
      "data": [
          {
            "id": 0,
            "email": "string",
            "first_name": "string",
            "last_name": "string",
            "avatar": "string"
        }
      ]
    }
  ```

Let's perform a request to this endpoint using FsHttp.

In general, requests are composed using the `http` computation expression and then sent to the server using `Request.send`. This returns a `Response` object, which carries a lot of useful information about the response (status code, headers, body, etc.) but is fairly complex to work with directly. Since we already know that the API returns a JSON object, we can use `Response.toJson`, which parses the response body into a `JsonElement`.


In [47]:
open FsHttp

let users_list = 
    http {
        GET "https://reqres.in/api/users"
    }
    |> Request.send
    |> Response.toJson

users_list

We can use the dynamic `?` operator to take a look at some JSON properties in the response. Since the API docs tell us that the `data` property will contain the users, let's take a look at it:

In [48]:
users_list?data

### Query parameters in FsHttp

For a quick look at how to use URL query parameters with FsHttp, let's use the `users` endpoint again, but this time with 1 user per page, looking at the second page.

`page` in this context means that the server will serve us the results in chunks (pages), a concept called [pagination](#pagination) that will become important later as well.

In [50]:
http {
    GET "https://reqres.in/api/users"
    query [
        "page", 2
        "per_page", 1
    ]
}
|> Request.send
|> Response.toJson

Let's now look at the other fields in the response:
- `page` tells us which chunk of the result we are looking at
- `per_page` tells us how many results are on each page
- `total` tells us how many results there are in total
- `total_pages` tells us how many pages there are in total

With this information, we can navigate the results. For example, we can access the last result without ever requesting the first page:

In [9]:
http {
    GET "https://reqres.in/api/users"
    query [
        "page", 12
        "per_page", 1
    ]
}
|> Request.send
|> Response.toJson
|> fun json -> json?data

### Parsing and processing JSON response data

So far, we have only looked at the JSON response data using the dynamic `?` operator. This is useful for exploring the JSON structure, but not for actually processing the data. For this, we need to deserialize the JSON into a strongly typed F# record:

In [51]:
type User = {
    id: int
    email: string
    first_name: string
    last_name: string
    avatar: string
}

let users_list = 
    http {
        GET "https://reqres.in/api/users"
    }
    |> Request.send
    |> Response.toJson
    |> fun json -> json?data.Deserialize<User list>()

We can now for example write a function that creates a little thumbnail for a user:

In [11]:
let createUserThumbnail (u: User) =
    $"""
    <div> 
        <h1>{u.first_name} {u.last_name}</h1>
        <img src="{u.avatar}" />
        <br>
        <a href="mailto:{u.email}">{u.email}</a>
    </div>
    """

users_list
|> List.head
|> createUserThumbnail
|> DisplayFunctions.HTML

## Other common API concepts

The following loosely grouped concepts are also often encountered in web APIs, mostly to prevent servers from being overloaded.

### Rate Limiting

No matter what type of request, the server will have to use resources to fulfill it. Many APIs have rate limits, limiting the number of requests a client can make in a certain time period to prevent clients from overloading the server with requests.

Many APIs offer a way for registered users to increase their rate limits by authenticating themselves. This is usually done by sending an authentication token with the request, which the server can use to identify the user and increase the rate limit.

As an example, GitHub's search API will rate limit unauthenticated users at 10 requests/minute, while authenticated users can make up to 30 requests/minute.

Let's see that in action:

In [54]:
let simpleRepoQuery () = 
    http {
        GET "https://api.github.com/search/repositories"
        query [
            "q", "repo:CSBiology/CSBlog"
        ]
        UserAgent "fshttp"
    }
    |> Request.send
    |> Response.toJson

simpleRepoQuery()?items.EnumerateArray() 
|> Seq.cast<JsonElement>
|> Seq.item 0
|> fun json -> json?full_name, json?description

| Field | Value |
| --- | --- |
| Item1 | "CSBiology/CSBlog" |
| Item2 | "The blog about all things CSB" |


In [55]:
for i in 1..10 do simpleRepoQuery() |> ignore

simpleRepoQuery()?message

### Pagination

Many APIs will return a large amount of data, which can be problematic for performance reasons. Imagine a request that returns 1000 results, each containing a large amount of data. This would take a long time to process, require a lot of memory, and a very stable internet connection.

For this reason, many APIs will split the results into multiple chunks (pages), which we can then request individually.

We have seen this concept already in the [query parameters in FsHttp](#query-parameters-in-fshttp) section, where we used the `page` and `per_page` parameters to navigate the results. I just wanted to mention it again with the proper name here.
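The relationship between the total number of results, the page size, and the number of pages is simple arithmetic. The helpers below are a sketch with made-up names (`totalPages`, `pageOfItem`), and the numbers are example values rather than the output of any API:

```fsharp
// Hypothetical helpers for reasoning about paginated results.

// number of pages needed to hold `total` results (rounds up)
let totalPages (total: int) (perPage: int) =
    (total + perPage - 1) / perPage

// one-based page on which the result with the given zero-based index is found
let pageOfItem (perPage: int) (itemIndex: int) =
    itemIndex / perPage + 1

printfn "%d" (totalPages 12 1)   // 12
printfn "%d" (totalPages 12 5)   // 3
printfn "%d" (pageOfItem 5 11)   // 3
```

With 12 results and one result per page, the last result sits on page 12, which is the kind of jump to the last page shown in the query parameters section.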

### Result Limiting

Similar to rate limiting, many APIs will also limit the number of results that can be requested in a request. There are multiple ways to implement this:
- **limit the number of results per page**. For example, the GitHub repository search API will only return up to 100 results per page, even if there are more results.
- **absolutely limit the number of results** a request can produce. For example, the GitHub search API will only return up to 1,000 paginated results per query, even if there are more results.

## Request Batching

All of the [limiting concepts](#other-common-api-concepts) are there for a reason, and we as clients must work with them. A few strategies help with that:

- **Cleverly batch and combine requests.**
    Aggregate the results of several smaller requests when a single request would not work, e.g. by adding filters for date ranges: request the results for each month of the last year, then combine them to get **ALL** results for that year, which would not have been possible with a single request.
- **Make sure to authenticate with the server.**
    If the API grants higher rate limits to authenticated users, let's authenticate and get the maximum rate limit.
- **Wait a certain number of seconds between requests** so that you do not hit rate limits.
- **Save results of chunked requests.** There is nothing more frustrating than running 100 chunks of requests and hitting an error in the last 10 that makes the script exit without any results.
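The waiting strategy can be sketched in a few lines. `delayFor` and `throttledMap` are hypothetical helpers written for this illustration, and the doubling function stands in for a real request so that the example runs without a network connection:

```fsharp
open System
open System.Threading

// Hypothetical helper: the pause needed between requests
// to stay within a limit of `requestsPerMinute`.
let delayFor (requestsPerMinute: int) =
    TimeSpan.FromSeconds(60.0 / float requestsPerMinute)

// Hypothetical helper: applies `request` to every input in order,
// sleeping for `delay` before every call except the first.
let throttledMap (delay: TimeSpan) (request: 'a -> 'b) (inputs: 'a list) =
    inputs
    |> List.mapi (fun i input ->
        if i > 0 then Thread.Sleep(delay)
        request input)

// a limit of 10 requests per minute means one request every 6 seconds
let pause = delayFor 10
printfn "%.1f" pause.TotalSeconds   // 6.0

// stand-in for a real request, with a very short delay to keep the example fast
let results = throttledMap (TimeSpan.FromMilliseconds(10.0)) (fun x -> x * 2) [1; 2; 3]
printfn "%A" results                // [2; 4; 6]
```

A fixed pause is the simplest possible choice. Whether it is sufficient depends on how the API in question counts requests.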



## Case study: retrieving all Pubmed 

In [63]:
http {
    GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    query [
        "db", "pubmed"
        "term", "arabidopsis thaliana"
        "retmode", "json"
        "retmax", 9999
        "retstart", 9999
    ]
}
|> Request.send
|> Response.toJson

Error: System.Exception: Could not parse JSON: One or more errors occurred. ('0x0A' is invalid within a JSON string. The string should be correctly escaped. LineNumber: 0 | BytePositionInLine: 104.)
Content:
{"header":{"type":"esearch","version":"0.3"},"esearchresult":{"ERROR":"Search Backend failed: Exception:
'retstart' cannot be larger than 9998. For PubMed, ESearch can only retrieve the first 9,999 records matching the query. To obtain more than 9,999 PubMed records, consider using EDirect that contains additional logic to batch PubMed search results automatically so that an arbitrary number can be retrieved. For details see https://www.ncbi.nlm.nih.gov/books/NBK25499/"
 ---> System.AggregateException: One or more errors occurred. ('0x0A' is invalid within a JSON string. The string should be correctly escaped. LineNumber: 0 | BytePositionInLine: 104.)
 ---> System.Text.Json.JsonReaderException: '0x0A' is invalid within a JSON string. The string should be correctly escaped. LineNumber: 0 | BytePositionInLine: 104.
   at System.Text.Json.ThrowHelper.ThrowJsonReaderException(Utf8JsonReader& json, ExceptionResource resource, Byte nextByte, ReadOnlySpan`1 bytes)
   at System.Text.Json.Utf8JsonReader.ConsumeStringAndValidate(ReadOnlySpan`1 data, Int32 idx)
   at System.Text.Json.Utf8JsonReader.ConsumeString()
   at System.Text.Json.Utf8JsonReader.ConsumeValue(Byte marker)
   at System.Text.Json.Utf8JsonReader.ReadSingleSegment()
   at System.Text.Json.Utf8JsonReader.Read()
   at System.Text.Json.JsonDocument.Parse(ReadOnlySpan`1 utf8JsonSpan, JsonReaderOptions readerOptions, MetadataDb& database, StackRowStack& stack)
   at System.Text.Json.JsonDocument.Parse(ReadOnlyMemory`1 utf8Json, JsonReaderOptions readerOptions, Byte[] extraRentedArrayPoolBytes, PooledByteBufferWriter extraPooledByteBufferWriter)
   at System.Text.Json.JsonDocument.ParseAsyncCore(Stream utf8Json, JsonDocumentOptions options, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   --- End of inner exception stack trace ---
   at FsHttp.Response.catchHandler@1[a](String parserName, Utf8StringBufferingStream bufferingStream, Exception ex)
   at FsHttp.Response.parseAsync@37-9.Invoke(Exception edi)
   at Microsoft.FSharp.Control.AsyncPrimitives.CallFilterThenInvoke[T](AsyncActivation`1 ctxt, FSharpFunc`2 filterFunction, ExceptionDispatchInfo edi) in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 547
   at Microsoft.FSharp.Control.Trampoline.Execute(FSharpFunc`2 firstAction) in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 148
--- End of stack trace from previous location ---
   at Microsoft.FSharp.Control.AsyncResult`1.Commit() in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 455
   at Microsoft.FSharp.Control.AsyncPrimitives.RunImmediate[a](CancellationToken cancellationToken, FSharpAsync`1 computation) in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 1160
   at Microsoft.FSharp.Control.FSharpAsync.RunSynchronously[T](FSharpAsync`1 computation, FSharpOption`1 timeout, FSharpOption`1 cancellationToken) in D:\a\_work\1\s\src\FSharp.Core\async.fs:line 1507
   at <StartupCode$FSI_0106>.$FSI_0106.main@()
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)

In [69]:
open System

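let createDateRanges (monthsToAdd: int) = 
    let startDate = DateTime(1945, 1, 1)

    // generate boundary dates that are `monthsToAdd` months apart,
    // stopping before the year after next
    Seq.unfold (fun (d: DateTime) -> 
        if d.Year < System.DateTime.Now.Year + 2 then 
            Some(d, d.AddMonths(monthsToAdd)) 
        else None
    ) startDate
    // turn consecutive boundaries into (start, end) pairs
    |> Seq.pairwise
    // shift each start by one day so that adjacent ranges do not overlap
    |> Seq.map (fun (startDate,endDate) -> startDate.AddDays(1), endDate)
    // format the dates as strings for use as query parameters
    |> Seq.map (fun (startDate,endDate) -> startDate.ToString("yyyy/MM/dd"), endDate.ToString("yyyy/MM/dd"))
    |> Array.ofSeq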
    
createDateRanges 10
|> Seq.take 10

| Item1 | Item2 |
| --- | --- |
| 1945.01.02 | 1945.11.01 |
| 1945.11.02 | 1946.09.01 |
| 1946.09.02 | 1947.07.01 |
| 1947.07.02 | 1948.05.01 |
| 1948.05.02 | 1949.03.01 |
| 1949.03.02 | 1950.01.01 |
| 1950.01.02 | 1950.11.01 |
| 1950.11.02 | 1951.09.01 |
| 1951.09.02 | 1952.07.01 |
| 1952.07.02 | 1953.05.01 |


In [15]:
http {
    GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    query [
        "db", "pubmed"
        "term", "arabidopsis thaliana"
        "retmode", "json"
        "retmax", "100000"
    ]
}
|> Request.send
|> Response.toJson
|> fun json -> json?esearchresult?count

In [70]:
open System.Threading

let getArabidopsisIdsBatched (batchSizeMonths: int) =
    let date_ranges = createDateRanges batchSizeMonths
    date_ranges
    |> Array.ofSeq
    |> Array.mapi (fun i (startDate,endDate) ->
        //Thread.Sleep(333)
        if i % 10 = 0 then printfn $"{i+1}/{date_ranges.Length}"
        if i = date_ranges.Length - 1 then printfn "last one!"
        let response = 
            http {
                GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
                query [
                    "db", "pubmed"
                    "term", "arabidopsis thaliana"
                    "retmode", "json"
                    "retmax", "9999"
                    "mindate", startDate
                    "maxdate", endDate
                ]
            }
            |> Request.send
            |> Response.toJson

        let count = 
            try 
                response?esearchresult?count.GetString() |> int
            with _ ->
                printfn "error parsing response: %A" response
                -1
        if count > 9999 then printfn $"still too large of a query: {startDate}-{endDate}: {count}"
        response
    )

In [27]:
let ids_batched_24_Months = getArabidopsisIdsBatched 24

1/39
11/39
21/39
31/39
still too large of a query: 2009.01.02-2011.01.01: 10723
still too large of a query: 2011.01.02-2013.01.01: 12611
still too large of a query: 2013.01.02-2015.01.01: 13364
still too large of a query: 2015.01.02-2017.01.01: 13269
still too large of a query: 2017.01.02-2019.01.01: 13599
still too large of a query: 2019.01.02-2021.01.01: 14738
last one!
still too large of a query: 2021.01.02-2023.01.01: 11981


In [33]:
// note: despite the name of the binding, the batch size used here is 3 months
let ids_batched_6_Months = getArabidopsisIdsBatched 3

1/319
11/319
21/319
31/319
41/319
51/319
61/319
71/319
81/319
91/319
101/319
111/319
121/319
131/319
141/319
151/319
161/319
171/319
181/319
191/319
201/319
211/319
221/319
231/319
241/319
251/319
261/319
271/319
281/319
291/319
301/319
still too large of a query: 2021.10.02-2022.01.01: 10027
311/319
last one!


In [68]:
http {
    GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    query [
        "db", "pubmed"
        "term", "arabidopsis thaliana"
        "retmode", "json"
        "retmax", "9999"
        "retstart", "0"
        "mindate", "2022.10.02"
        "maxdate", "2022.11.01"
    ]
}
|> Request.send
|> Response.toJson

In [71]:
let ids =
    ids_batched_6_Months
    |> Array.map (fun r ->
        r?esearchresult?idlist.Deserialize<string []>()
    )
    |> Array.concat
    |> Array.distinct
ids.Length

In [20]:
open System.IO

// make sure the output folder exists before writing to it
Directory.CreateDirectory("./abstracts") |> ignore

let abstracts = 
    ids
    |> Array.chunkBySize 200
    |> Array.mapi (fun i chunk ->
        let path = $"./abstracts/abstracts_{i*200}_{(i + 1) * 200}.txt"
        if i % 25 = 0 then printfn $"{i * 200} / {ids.Length}"
        if File.Exists(path) then 
            // this chunk was saved in a previous run: reuse the file instead of requesting it again
            printfn $"skipping {i*200}_{(i + 1) * 200}"
            File.ReadAllText(path)
        else
            let conc = chunk |> String.concat ","
            let text =
                http { 
                    GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
                    query [
                        "db", "pubmed"
                        "id", conc
                        "retmode", "json"
                        "rettype", "abstract"
                    ] 
                }
                |> Request.send
                |> Response.toText
            // save the chunk right away so that a later error does not lose it
            File.WriteAllText(path, text)
            text
    )

0 / 90470
skipping 0_200
skipping 200_400
skipping 400_600
skipping 600_800
skipping 800_1000
skipping 1000_1200
skipping 1200_1400
skipping 1400_1600
skipping 1600_1800
skipping 1800_2000
skipping 2000_2200
skipping 2200_2400
skipping 2400_2600
skipping 2600_2800
skipping 2800_3000
skipping 3000_3200
skipping 3200_3400
skipping 3400_3600
skipping 3600_3800
skipping 3800_4000
skipping 4000_4200
skipping 4200_4400
skipping 4400_4600
skipping 4600_4800
skipping 4800_5000
5000 / 90470
skipping 5000_5200
skipping 5200_5400
skipping 5400_5600
skipping 5600_5800
skipping 5800_6000
skipping 6000_6200
skipping 6200_6400
skipping 6400_6600
skipping 6600_6800
skipping 6800_7000
skipping 7000_7200
skipping 7200_7400
skipping 7400_7600
skipping 7600_7800
skipping 7800_8000
skipping 8000_8200
skipping 8200_8400
skipping 8400_8600
skipping 8600_8800
skipping 8800_9000
skipping 9000_9200
skipping 9200_9400
skipping 9400_9600
skipping 9600_9800
skipping 9800_10000
10000 / 90470
skipping 10000_10200
sk

In [19]:
abstracts[0]


Med Abstr. 1945 Jul;63A:162-71.

A survey of 200 samples of pickles in Chengtu for intestinal pathogens.

LIAO SJ.

PMID: 21009879 [Indexed for MEDLINE]


Dtsch Gesundheitsw. 1947 Mar 1;2(5):173.

[Report on clinical observations in acute resorptive sodium nitrite poisoning 
due to consumption of broth containing meat].

[Article in German]

SCHULZE W, STEUDE K.

PMID: 20342470 [Indexed for MEDLINE]


Food Ind. 1948 Mar;20(3):350.

Pickle processing standardized by use of germicidal detergent.

BERNSTEIN HI, EPSTEIN S.

PMID: 18906492 [Indexed for MEDLINE]


Z Indukt Abstamm Vererbungsl. 1955;87(1):47-64.

[Case of superdominance of experimentally produced autotetraploids of 
Arabidopsis thaliana].

[Article in German]

WRICKE G.

PMID: 13312459 [Indexed for MEDLINE]


Nature. 1955 Aug 6;176(4475):260-1. doi: 10.1038/176260b0.

Biochemical mutations in the crucifer Arabidopsis thaliana (L.) Heynh.

LANGRIDGE J.

DOI: 10.1038/176260b0
PMID: 13244676 [Indexed for MEDLINE]


Z Vererbungsl