# Internal Links

Create dynamic Internal Links for your website by leveraging the semantic data in your graph.

> ℹ️ **Info**
>
> Internal links are hyperlinks that connect one page of a website to another page on the same domain. They facilitate user navigation, establish the site's information hierarchy, and distribute link equity across pages, thereby enhancing SEO performance. By constructing a semantic graph—a network that illustrates the relationships between various pages and topics—websites can create more meaningful internal links, aiding search engines in comprehending content relevance and context, which can lead to improved indexing and higher search rankings.

This notebook queries the website's graph and creates internal links for each web page. The results are written back to the graph using `seovoc:LinkGroup`.

This is the second notebook of a series. Start with the first notebook to [create your graph in less than 5 minutes](getting_started_notebook.ipynb).

## GraphQL Example

You can query the internal links using GraphQL.

### Request

This is the GraphQL query:

```graphql
query($url: String!) {
  entities(query: { urlConstraint: { in: [$url] } }) {
    iri
    link_groups: resources(name: "seovoc:hasLinkGroup") {
      iri
      identifier: string(name: "schema:identifier")
      name: string(name: "schema:name")
      has_link: resources(name: "seovoc:hasLink") {
        iri
        position: int(name: "schema:position")
        anchor_text: string(name: "seovoc:anchorText")
        anchor_value: ref(name: "seovoc:anchorValue")
        anchor_resource: resource(name: "seovoc:anchorResource") {
          iri
          abstract: string(name:"schema:abstract")
          keywords: strings(name:"schema:keywords")
          published_at: dateTime(name:"schema:datePublished")
        }
      }
    }
  }
}
```

This is the full curl command:

```shell
## Request
curl -X "POST" "https://api.wordlift.io/graphql" \
     -H 'Authorization: Key ...' \
     -H 'X-Include-Private: true' \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
	"query": "query($url:String!){entities(query:{urlConstraint:{in:[$url]}}){iri link_groups:resources(name:\\"seovoc:hasLinkGroup\\"){iri identifier:string(name:\\"schema:identifier\\")name:string(name:\\"schema:name\\")has_link:resources(name:\\"seovoc:hasLink\\"){iri position:int(name:\\"schema:position\\")anchor_text:string(name:\\"seovoc:anchorText\\")anchor_value:ref(name:\\"seovoc:anchorValue\\")anchor_resource:resource(name:\\"seovoc:anchorResource\\"){iri abstract:string(name:\\"schema:abstract\\")keywords:strings(name:\\"schema:keywords\\")published_at:dateTime(name:\\"schema:datePublished\\")}}}}}",
	"variables": {
		"url": "https://wordlift.io/blog/en/google-carousel-seo/"
	}
}'
```
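
The same request can be sketched in Python with only the standard library. This is a minimal sketch, not part of the notebook: the function names `build_request` and `fetch_link_groups` are made up for illustration, the query is shortened for readability, and the `Authorization` value assumes that the elided part of the header above is your WordLift Key.

```python
import json
import urllib.request

GRAPHQL_ENDPOINT = "https://api.wordlift.io/graphql"

# Shortened query for readability; swap in the full query from the Request section.
QUERY = """
query($url: String!) {
  entities(query: { urlConstraint: { in: [$url] } }) {
    iri
    link_groups: resources(name: "seovoc:hasLinkGroup") {
      iri
      name: string(name: "schema:name")
    }
  }
}
"""


def build_request(page_url: str, wordlift_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the given page URL."""
    payload = {"query": QUERY, "variables": {"url": page_url}}
    return urllib.request.Request(
        GRAPHQL_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the header value is "Key " followed by the WordLift Key.
            "Authorization": f"Key {wordlift_key}",
            "X-Include-Private": "true",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def fetch_link_groups(page_url: str, wordlift_key: str) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(page_url, wordlift_key)) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the payload and headers without touching the network.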

### Response

This is the JSON response (a fragment):

```json
{
  "data": {
    "entities": [
      {
        "iri": "https://data.wordlift.io/wl1505904/google-carousel-seo-powerful-leverage-for-your-strategy-f70e79a1373ddb57776b0f533dbc3c91",
        "link_groups": [
          {
            "iri": "https://data.wordlift.io/wl1505904/google-carousel-seo-powerful-leverage-for-your-strategy-f70e79a1373ddb57776b0f533dbc3c91/linkgroup_getting_started",
            "identifier": "getting_started",
            "name": "Related Links",
            "has_link": [
              {
                "iri": "https://data.wordlift.io/wl1505904/google-carousel-seo-powerful-leverage-for-your-strategy-f70e79a1373ddb57776b0f533dbc3c91/linkgroup_getting_started/link_6",
                "position": 6,
                "anchor_text": "Smarter Graph",
                "anchor_value": "https://wordlift.io/blog/en/knowledge-graph-seo/",
                "anchor_resource": {
                  "iri": "https://data.wordlift.io/wl1505904/build-a-smarter-knowledge-graph-to-boost-seo---wordlift-blog-726261fe20cc38ef67d3fe00a66a8c05",
                  "abstract": "A knowledge graph for SEO provides relevant facts to search engines: this helps editors and business owners maximise search rankings and SERP visibility.",
                  "keywords": [
                    "personal assistant search optimization",
                    "Wikidata",
                    "voice search",
                    "Google's",
                    "knowledge graph",
                    "SEO",
                    "knowledge panels",
                    "semantic web",
                    "metadata"
                  ],
                  "published_at": "2024-10-01T00:00:00Z"
                }
              },
...
```
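
To work with this response in Python, the nested structure can be flattened into one record per link. The sketch below is illustrative only: `flatten_links` is a hypothetical helper, and `sample` is a trimmed copy of the fragment above that uses the same aliases as the query (`link_groups`, `has_link`, `anchor_text`, `anchor_value`).

```python
from typing import Any


def flatten_links(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one flat record per link, sorted by group identifier and position."""
    rows = []
    for entity in response.get("data", {}).get("entities", []):
        for group in entity.get("link_groups") or []:
            for link in group.get("has_link") or []:
                rows.append({
                    "group": group.get("identifier"),
                    "position": link.get("position"),
                    "anchor_text": link.get("anchor_text"),
                    "target_url": link.get("anchor_value"),
                })
    return sorted(rows, key=lambda r: (r["group"] or "", r["position"] or 0))


# Trimmed copy of the response fragment shown above.
sample = {
    "data": {
        "entities": [
            {
                "link_groups": [
                    {
                        "identifier": "getting_started",
                        "name": "Related Links",
                        "has_link": [
                            {
                                "position": 6,
                                "anchor_text": "Smarter Graph",
                                "anchor_value": "https://wordlift.io/blog/en/knowledge-graph-seo/",
                            }
                        ],
                    }
                ]
            }
        ]
    }
}

for row in flatten_links(sample):
    print(row["group"], row["position"], row["anchor_text"], row["target_url"])
```

The `or []` guards cover entities that have no link groups yet, for which the fields may come back as `null`.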

## Configuration

There are two configuration sources. At least one of them is required, and they are applied in this order:

1. The `config/default.py` file
2. Local constants and the WordLift Key in Google Colab Secrets

There is only one configuration setting:

* `WORDLIFT_KEY`: the WordLift Key. When using Google Colab, it can be set in the secrets.

In [None]:
import logging

logging.basicConfig(level=logging.WARNING, force=True)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Default to None so the final check works even when both sources are unavailable.
WORDLIFT_KEY = None

# Configuration from the `config/default.py` file.
try:
    from config import default as config

    WORDLIFT_KEY = config.WORDLIFT_KEY
except ImportError:
    logger.warning("Cannot import configuration from local `config/default.py` file.")

# Configuration from Google Colab Secrets (applied second, so it overrides the file).
try:
    from google.colab import userdata

    WORDLIFT_KEY = userdata.get('WORDLIFT_KEY')
except ImportError:
    logger.warning("Cannot import configuration from `google.colab.userdata`.")

if WORDLIFT_KEY is None:
    raise ValueError('Configuration not set')

# Dependencies

This part is only for Google Colab. When the notebook is used locally we recommend using `poetry install`.

In [None]:
import sys

if "google.colab" in sys.modules:
    !pip install \
    "tqdm>=4.67.1,<5.0.0" \
    "wordlift-sdk @ git+https://github.com/wordlift/python-sdk.git"

# Imports

This section provides general imports and basic configuration. No changes are needed here.

In [None]:
from tqdm.asyncio import tqdm
from wordlift_sdk.client import ClientConfigurationFactory
from wordlift_sdk.utils import delayed, create_dataframe_of_entities_with_embedding_vectors
from wordlift_sdk.internal_link import create_internal_link_handler

# Defining the host is optional and defaults to https://api.wordlift.io
# See configuration.py for a list of all supported configuration parameters.
# Note: `api_url` is shown for reference only; it is not passed to the factory below.
api_url = 'https://api.wordlift.io'
configuration = ClientConfigurationFactory(key=WORDLIFT_KEY).create()

# Main Function

This is the notebook's main function.

## How does it work

We query the graph for all the entities that have embedding vectors. The results are stored as `iri`, `url` pairs in the `entities_with_embedding_vectors_df` dataframe.

We then call the SDK's `create_internal_link_handler` function, passing the client configuration and the ID of the Link Group that we want to create.

We can optionally provide an `internal_link_request_filter` function with the signature `Callable[[Series, InternalLinkRequest], Awaitable[InternalLinkRequest]]` to alter the request with additional filters. For example, we may want to require at least one keyword shared between the source web page and the target web page.
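
The shape of such a filter can be sketched as an async function that receives the row and the request and returns the (possibly modified) request. Everything SDK-specific below is an assumption made for illustration: `StubInternalLinkRequest` is a stand-in and not the SDK's `InternalLinkRequest`, its `required_keywords` field is hypothetical, and the row is assumed to carry a `keywords` value. A plain mapping stands in for the pandas `Series` so the sketch runs without extra dependencies. Check the SDK source for the real request fields and for how the filter is passed to `create_internal_link_handler`.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass
class StubInternalLinkRequest:
    """Hypothetical stand-in for the SDK's InternalLinkRequest (NOT the real class)."""
    url: str
    # Hypothetical field: keywords a target page must share with the source page.
    required_keywords: list[str] = field(default_factory=list)


async def keyword_filter(
    row: Mapping[str, Any], request: StubInternalLinkRequest
) -> StubInternalLinkRequest:
    """Copy the source page's normalized keywords from the row into the request."""
    keywords = row.get("keywords") or []  # assumed column; the notebook only documents iri and url
    request.required_keywords = sorted(
        {k.strip().lower() for k in keywords if k.strip()}
    )
    return request


if __name__ == "__main__":
    row = {"url": "https://example.org/a", "keywords": ["SEO", "Knowledge Graph ", "seo"]}
    result = asyncio.run(keyword_filter(row, StubInternalLinkRequest(url=row["url"])))
    print(result.required_keywords)
```

Inside a notebook, where an event loop is already running, use `await keyword_filter(...)` instead of `asyncio.run(...)`.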

The results are written to the graph using schema.org and [seontology](https://github.com/seontology/seontology/).

In [None]:
async def main() -> None:
    entities_with_embedding_vectors_df = await create_dataframe_of_entities_with_embedding_vectors(WORDLIFT_KEY)

    # We're polite and don't make more than 2 concurrent requests.
    handler = create_internal_link_handler(configuration, 'getting_started')
    await tqdm.gather(
        *[delayed(handler, 2)(row) for _, row in entities_with_embedding_vectors_df.iterrows()],
        total=len(entities_with_embedding_vectors_df)
    )

    # Log the URL and IRI of each processed entity.
    for _, row in entities_with_embedding_vectors_df.iterrows():
        logger.info("%s %s", row['url'], row['iri'])


await main()
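
The idea of capping concurrent requests can be illustrated with a plain `asyncio.Semaphore`. This is a generic sketch of the technique and not the SDK's `delayed` implementation, whose internals are not shown in this notebook; `limit_concurrency`, `fake_handler`, and `demo` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def limit_concurrency(
    func: Callable[[T], Awaitable[R]], max_concurrent: int
) -> Callable[[T], Awaitable[R]]:
    """Wrap an async function so at most `max_concurrent` calls run at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def wrapper(item: T) -> R:
        # Calls beyond the limit wait here until a slot is released.
        async with semaphore:
            return await func(item)

    return wrapper


async def demo() -> int:
    """Run 10 fake requests with a limit of 2 and return the observed peak."""
    running = 0
    peak = 0

    async def fake_handler(item: int) -> int:
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)  # simulate a network call
        running -= 1
        return item

    # Wrap once and reuse the wrapper, so all calls share the same semaphore.
    limited = limit_concurrency(fake_handler, 2)
    await asyncio.gather(*[limited(i) for i in range(10)])
    return peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The wrapper is created once and shared by all calls, which is what makes the limit global. In a notebook, call `await demo()` instead of `asyncio.run(demo())`.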
