* XPath (XML Path Language) is a language for querying XML documents.
* It is one of the most widely used strategies for locating elements on a web page.
* An XPath consists of a path expression, optionally combined with conditions (predicates).
* With it, we can write a query that locates any element on a web page.
* It is designed for navigating XML documents in order to select individual elements, attributes, or other parts of the document for specific processing.
* A well-chosen XPath also produces reliable locators.

# XML Document

```
<bookstore>
	<book category="cooking">
		<title lang="en">Everyday Chinese</title>
		<author>K.S. Bose</author>
	</book>
	<book category="children">
		<title lang="en">Harry Potter</title>
		<author>J. K Rowling</author>
	</book>
</bookstore>
```

* An XML document has a tree-like structure.
* It consists of elements (written as tags), which may carry attributes.
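The tree structure can be made visible by walking the document. The sketch below assumes the third-party `lxml` package is installed; the helper name `outline` is made up for this illustration.

```python
from lxml import etree

XML = """
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Chinese</title>
    <author>K.S. Bose</author>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J. K Rowling</author>
  </book>
</bookstore>
"""


def outline(element, depth=0):
    """Return one (depth, tag, attributes) tuple per element, in document order."""
    rows = [(depth, element.tag, dict(element.attrib))]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


root = etree.fromstring(XML)
tree_rows = outline(root)

# Indentation mirrors the nesting depth of each element in the tree.
for depth, tag, attrs in tree_rows:
    print("  " * depth + tag, attrs)
```

Each level of indentation in the printed output corresponds to one level of the tree: `bookstore` is the root, each `book` is its child, and `title` and `author` are children of a `book`.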

![image.png](attachment:795df6cd-5a12-4263-8eaf-06583764c12e.png)

# Syntax of XPATH

![image.png](attachment:5292db2d-b5b7-49d6-b842-83879fe38887.png)

**`//*[@attribute='value']`**
* `//`: Selects matching nodes anywhere in the document, regardless of their depth.
* `*`: Matches any element (any HTML tag).
* `@`: Selects an attribute.
* **Attribute**: The name of the attribute on the node.
* **Value**: The value of that attribute.
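As a minimal sketch of this syntax, the snippet below runs the pattern against a shortened copy of the bookstore document. It assumes the third-party `lxml` package is installed.

```python
from lxml import etree

root = etree.fromstring("""
<bookstore>
  <book category="cooking"><title lang="en">Everyday Chinese</title></book>
  <book category="children"><title lang="en">Harry Potter</title></book>
</bookstore>
""")

# //  -> search at any depth, * -> any tag, @category -> the attribute to test
cooking = root.xpath("//*[@category='cooking']")
cooking_titles = [book.findtext("title") for book in cooking]

# The same pattern with a concrete tag name instead of the wildcard
english_titles = [t.text for t in root.xpath("//title[@lang='en']")]

print(cooking_titles)
print(english_titles)
```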


# Types of XPATH expression

There are two types of XPath:

**Absolute xpath expression (Using Single Slash)**:
* Finding elements this way is known as using an **absolute XPath**.
* A single slash (`/`) at the start creates an absolute path, **i.e.** the selection starts from the document root and names every step down to the target element.
* It is the most direct way to find an element, but it is brittle: if anything along the path to the element changes, the **absolute XPath** no longer matches.
* **For example**: `/html/body/div[1]/section/div[1]/div`

**Relative xpath expression (Using Double Slash)**:
* Finding elements this way is known as using a **relative XPath**.
* A double slash (`//`) creates a relative path, **i.e.** the selection can start from anywhere within the document.
* Because a **relative XPath** begins with `//` rather than at the root, it can start in the middle of the HTML DOM structure and match the element wherever it appears on the web page.
* **For example**: `//input[@id='ap_email']`
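The difference in robustness can be sketched as follows. Both page snippets are hypothetical, and the code assumes the third-party `lxml` package is installed.

```python
from lxml import etree

# Hypothetical page, and the same page after a redesign wraps the form in a <section>.
PAGE = "<html><body><div><form><input id='ap_email'/></form></div></body></html>"
REDESIGNED = (
    "<html><body><div><section><form><input id='ap_email'/></form>"
    "</section></div></body></html>"
)

ABSOLUTE = "/html/body/div/form/input"   # every step from the root is spelled out
RELATIVE = "//input[@id='ap_email']"     # matches the element at any depth


def count(markup, expression):
    """Number of nodes that the expression selects in the given markup."""
    return len(etree.fromstring(markup).xpath(expression))


absolute_before = count(PAGE, ABSOLUTE)
absolute_after = count(REDESIGNED, ABSOLUTE)
relative_before = count(PAGE, RELATIVE)
relative_after = count(REDESIGNED, RELATIVE)

print("absolute:", absolute_before, "->", absolute_after)
print("relative:", relative_before, "->", relative_after)
```

The absolute path stops matching once an extra element is inserted along the way, while the relative path still finds the input.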

![image.png](attachment:9c398033-0533-4a8d-93bc-66917c1abe84.png)

# XPATH functions

XPath provides functions that can be used inside an expression. They help locate elements reliably when an element's attribute values change dynamically.

**Advantages of XPath functions:**
1. They make locators more robust against values that change.
2. They make XPath expressions shorter.


# Types of XPATH functions

The three most commonly used XPath functions are:
1. `contains()`
2. `starts-with()`
3. `text()`

## `contains(@attribute, 'partial_value')`

* This function is used in an XPath expression to locate elements whose attribute value changes dynamically.
* It locates an element using a partial value of one of its attributes.

Consider the source code snippet:

![image.png](attachment:6bb57db7-4f82-4c47-8f83-7a11598f6ed8.png)

The value of the **`src`** attribute is a URL, and some part of that URL may change when the page is reloaded. In other words, one part of the attribute value is **static** while the rest is **dynamic**. In such cases, we use the static part of the `src` attribute in the XPath expression to locate the **`img`** element.

**CONDITION**: The attribute value must have a static part that does not change dynamically.

We can use the **`contains()`** function to **locate the element using the partial value of the attribute that is static**.

XPath query looks like: **`//img[contains(@src,'googleusercontent')]`**
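A minimal sketch of this query, assuming the third-party `lxml` package is installed. The image URLs and `alt` values are invented for the illustration and do not come from the screenshot above.

```python
from lxml import etree

doc = etree.fromstring("""
<div>
  <img src="https://lh3.googleusercontent.com/a/photo-7f3a.png" alt="avatar"/>
  <img src="https://example.com/static/logo.png" alt="logo"/>
</div>
""")

# Only the static fragment of the URL is used; the rest of the value may vary.
hits = doc.xpath("//img[contains(@src,'googleusercontent')]")
matched_alts = [img.get("alt") for img in hits]

print(matched_alts)
```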

## `starts-with(@attribute, 'prefix_of_value')`

* This function is used in an XPath expression to find a web element whose attribute value changes on refresh or after any other dynamic operation on the web page.
* It matches the starting text of the attribute value, so the element is still found when the rest of the value changes.

Consider the source code snippet:

![image.png](attachment:e81f8b8a-b4b7-4b80-9906-20ab5db83d89.png)

* As you can see in the figure, the value of the **`src`** attribute starts with **https**.
* The query below locates every `img` element whose `src` value starts with **https**.

**CONDITION**: The prefix (starting text) of the attribute value must be static.

The **`starts-with()`** function is used to locate the element using the **prefix or starting text of the attribute value that is static or constant**.

XPath query looks like: **`//img[starts-with(@src,'https')]`**
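The following sketch applies the query to a small, made-up fragment (the URLs are hypothetical). It assumes the third-party `lxml` package is installed.

```python
from lxml import etree

doc = etree.fromstring("""
<div>
  <img src="https://cdn.example.com/banner-8841.png" alt="banner"/>
  <img src="/static/logo.png" alt="logo"/>
  <img src="https://cdn.example.com/icon-17.png" alt="icon"/>
</div>
""")

# Matches every img whose src begins with the static prefix 'https'.
hits = doc.xpath("//img[starts-with(@src,'https')]")
prefixed_alts = [img.get("alt") for img in hits]

print(prefixed_alts)
```

Because `https` is a very common prefix, the query matches several images here; a longer static prefix would narrow the result.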

## `text()`

This function is used in an XPath expression to locate an element by its **exact text** (i.e. the inner text of the element).

![image.png](attachment:89af0747-13c3-40ec-86d2-41985f362320.png)

**CONDITION**: The element's text must match the given value exactly, irrespective of the tag.

XPath query looks like: **`//*[text()='Search Google or type a URL']`**

The **asterisk (`*`)** matches any tag, so the query selects every element with that exact text.

**We can also use the `text()` function together with the `contains()` function** to match part of the text.

The XPath query looks like: **`//*[contains(text(),'Search Google or type a URL')]`**
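The difference between the exact match and the `contains()` variant can be sketched as below. The markup is invented for the illustration, and the code assumes the third-party `lxml` package is installed.

```python
from lxml import etree

doc = etree.fromstring("""
<form>
  <label>Search Google or type a URL</label>
  <span>  Search Google or type a URL  </span>
  <button>Search</button>
</form>
""")

# Exact match: surrounding whitespace in <span> prevents a match there.
exact = doc.xpath("//*[text()='Search Google or type a URL']")
exact_tags = [el.tag for el in exact]

# Partial match: any element whose first text node contains the fragment.
partial = doc.xpath("//*[contains(text(),'Search Google')]")
partial_tags = [el.tag for el in partial]

print(exact_tags)
print(partial_tags)
```

The exact form is strict about whitespace, which is one reason the `contains()` variant is often the more forgiving choice.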



## `last()`

Selects the last element of the matched set; `last()-1` selects the second-to-last.

**Syntax:**

**`(//tagname[@attribute='value'])[last()]`**

**`(//tagname[@attribute='value'])[last()-1]`**

**Example**:

**`(//div[@id='xx'])[last()]`**

**`(//div[@id='xx'])[last()-1]`**
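A small sketch of `last()`, using an invented list and assuming the third-party `lxml` package is installed:

```python
from lxml import etree

doc = etree.fromstring("""
<ul>
  <li class="item">one</li>
  <li class="item">two</li>
  <li class="item">three</li>
</ul>
""")

# The parentheses collect all matches first; the predicate then picks from that set.
last_item = doc.xpath("(//li[@class='item'])[last()]")
second_to_last = doc.xpath("(//li[@class='item'])[last()-1]")

last_text = [li.text for li in last_item]
second_to_last_text = [li.text for li in second_to_last]

print(last_text, second_to_last_text)
```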


## `position()`

Selects one element from the list of matched elements according to the **position number** provided.

**NOTE**: Positions start at 1.

**Syntax**:

**`(//tagname[@attribute='value'])[position()=2]`**

**`(//tagname[@attribute='value'])[2]`**

**Example**:

**`(//div[@id='xx'])[position()=2]`**

**`(//div[@id='xx'])[2]`**
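The two spellings select the same element, as the sketch below suggests. The table is invented for the illustration, the `position()<=2` comparison is an extra not shown in the syntax above, and the code assumes the third-party `lxml` package is installed.

```python
from lxml import etree

doc = etree.fromstring("""
<table>
  <tr class="row"><td>alpha</td></tr>
  <tr class="row"><td>beta</td></tr>
  <tr class="row"><td>gamma</td></tr>
</table>
""")


def cell_texts(expression):
    """Text of the <td> inside each row selected by the expression."""
    return [row.findtext("td") for row in doc.xpath(expression)]


by_function = cell_texts("(//tr[@class='row'])[position()=2]")
by_number = cell_texts("(//tr[@class='row'])[2]")   # shorthand for position()=2

# position() can also be compared, e.g. to take the first two matches.
first_two = cell_texts("(//tr[@class='row'])[position()<=2]")

print(by_function, by_number, first_two)
```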

## Finding elements using index

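By providing an index in square brackets, we can move to the nth matching element. Without surrounding parentheses, the index is counted among the children of each parent, so the expression can return one element per parent.

**NOTE**: Indexes start at 1.

**Syntax**:

**`//tagname[@attribute='value'][index]`**

**Example**:

**`//div[@id='xx'][2]`**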
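The index form without parentheses behaves differently from the parenthesised form of the previous section, which the following sketch tries to show. The markup is invented, and the code assumes the third-party `lxml` package is installed.

```python
from lxml import etree

doc = etree.fromstring("""
<root>
  <section><p class="note">a1</p><p class="note">a2</p></section>
  <section><p class="note">b1</p><p class="note">b2</p></section>
</root>
""")

# Without parentheses: the 2nd matching <p> inside EACH parent.
per_parent = [p.text for p in doc.xpath("//p[@class='note'][2]")]

# With parentheses: the 2nd match in the whole document.
overall = [p.text for p in doc.xpath("(//p[@class='note'])[2]")]

print(per_parent)
print(overall)
```

If exactly one element is expected, wrapping the path in parentheses before applying the index is usually the safer choice.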

# XPATH Chaining

We can **chain multiple relative XPath** expressions with the double slash **`//`** to narrow down an element's location, as shown:
```
//div[@id='abc']//a[@id='xyz']
```
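A minimal sketch of chaining, assuming the third-party `lxml` package is installed. The markup is invented, and it deliberately reuses the `xyz` id in two places to show how the first step narrows the search.

```python
from lxml import etree

doc = etree.fromstring("""
<body>
  <div id="abc"><ul><li><a id="xyz" href="/inside">inside</a></li></ul></div>
  <div id="other"><a id="xyz" href="/outside">outside</a></div>
</body>
""")

# First locate the div, then search for the link at any depth below it.
chained_hrefs = list(doc.xpath("//div[@id='abc']//a[@id='xyz']/@href"))

print(chained_hrefs)
```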

# Using `and` & `or` operators

We can combine multiple attribute conditions using the `and` and `or` operators, as shown:

**`//a[@id='pt1:_UIScmi4' or @class='xnk xmi']`**


**`//a[@id='pt1:_UIScmi4' and @class='xnk xmi']`**
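The effect of the two operators can be sketched on four invented links that cover every combination of the two conditions. The code assumes the third-party `lxml` package is installed.

```python
from lxml import etree

doc = etree.fromstring("""
<nav>
  <a id="pt1:_UIScmi4" class="xnk xmi">both</a>
  <a id="pt1:_UIScmi4" class="other">id only</a>
  <a id="different" class="xnk xmi">class only</a>
  <a id="different" class="other">neither</a>
</nav>
""")

# 'or' keeps a link when at least one condition holds.
either = [a.text for a in doc.xpath("//a[@id='pt1:_UIScmi4' or @class='xnk xmi']")]

# 'and' keeps a link only when both conditions hold.
both = [a.text for a in doc.xpath("//a[@id='pt1:_UIScmi4' and @class='xnk xmi']")]

print(either)
print(both)
```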


# XPATH Axes

An **axis** represents a **relationship to the context (current) node** and is used to locate nodes relative to that node on the tree.

![image.png](attachment:c21d56c6-afda-444a-b4b5-ca2f7f42c8dc.png)

The major XPath axes follow family tree terminology:

* **`self::`** is you.

**Downward**:
* **`child::`** are your immediate children.
* **`descendant::`** are your children, and their children, recursively.
* **`descendant-or-self::`** are you and your descendants (`//` is shorthand for `/descendant-or-self::node()/`).

**Upward**:
* **`parent::`** is your mother or father.
* **`ancestor::`** are your parent, and your parent's parent, recursively.
* **`ancestor-or-self::`** are you and your ancestors.

**Sideways (consider elements earlier in the document to be younger)**:
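* **`preceding-sibling::`** are your younger siblings, in age order.
* **`following-sibling::`** are your older siblings, in age order.
* **`preceding::`** is everyone who comes before you in the document, excluding your ancestors: your younger siblings and their descendants, plus the younger siblings of your ancestors and their descendants.
* **`following::`** is everyone who comes after you in the document, excluding your descendants: your older siblings and their descendants, plus the older siblings of your ancestors and their descendants.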
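To see what each axis returns, the sketch below evaluates every axis from one context node in a small invented tree. It assumes the third-party `lxml` package is installed; the helper `names` is made up for this illustration and sorts the tag names alphabetically so the display does not depend on result order.

```python
from lxml import etree

doc = etree.fromstring("""
<a>
  <b1><c1/></b1>
  <b2><c2/><c3><d1/></c3><c4/></b2>
  <b3><c5/></b3>
</a>
""")

context = doc.xpath("//c3")[0]   # the node that plays the role of "you"


def names(axis):
    """Alphabetically sorted tag names of the elements on the given axis."""
    return sorted(el.tag for el in context.xpath(f"{axis}::*"))


AXES = [
    "self", "child", "descendant", "descendant-or-self",
    "parent", "ancestor", "ancestor-or-self",
    "preceding-sibling", "following-sibling", "preceding", "following",
]

for axis in AXES:
    print(f"{axis:20} {names(axis)}")
```

Note that `preceding` and `following` reach beyond the siblings of `c3`: they also include `b1`, `c1` and `b3`, `c5`, which sit under different parents.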

## `following-sibling`

**Select the following siblings of the context node (current node)**.

![image.png](attachment:f7e33998-d8a5-482d-a6cb-5cd522d6f8f0.png)

Syntax: **`//tagname[@attribute='value']/following-sibling::tagname`**

Example: **`//span[@class='xnu']/ancestor::div[@id='pt1:_USSpgl5']/following-sibling::div`**

In the above example, we are trying to access all menus under **Administration**.
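The example can be tried on a simplified, hypothetical version of such a menu (the real page markup is not shown here). The code assumes the third-party `lxml` package is installed.

```python
from lxml import etree

doc = etree.fromstring("""
<div id="menu">
  <div>Home</div>
  <div id="pt1:_USSpgl5"><span class="xnu">Administration</span></div>
  <div>Users</div>
  <div>Roles</div>
</div>
""")

# Climb from the span to its enclosing div, then take the divs that follow it
# under the same parent.
siblings = doc.xpath(
    "//span[@class='xnu']/ancestor::div[@id='pt1:_USSpgl5']/following-sibling::div"
)
sibling_texts = [div.text for div in siblings]

print(sibling_texts)
```

`Home` is not returned because it comes before the Administration entry, and only siblings that follow the context node are selected.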


## `following`

* Selects every node that comes after the closing tag of the current node in document order. This covers the `following-sibling` nodes and their descendants, as well as anything later in the document at any other level; descendants of the current node are excluded.
* The step before `following::` sets the context node, and the search then returns all matching elements after that node.

Syntax: **`//tagname[@attribute='value']/following::tagname`**

Example: **`//div[@id='xx']/following::input`**

So the search starts from the **`div`** whose **`id='xx'`** and returns all elements with **`tagname='input'`** that come after the closing **`div`** tag.
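A sketch of the `following` axis on an invented form, assuming the third-party `lxml` package is installed. It also contrasts a single slash before `following::` with a double slash, which additionally applies the axis to every descendant of the `div`. Names are sorted alphabetically for a stable display.

```python
from lxml import etree

doc = etree.fromstring("""
<form>
  <input name="before"/>
  <div id="xx"><input name="inside-1"/><input name="inside-2"/></div>
  <p><input name="nested-after"/></p>
  <input name="after"/>
</form>
""")

# Single slash: only inputs after the closing tag of the div, at any level.
single_slash = sorted(doc.xpath("//div[@id='xx']/following::input/@name"))

# Double slash: the axis is also applied from the div's descendants, so an
# input inside the div that follows another descendant is returned as well.
double_slash = sorted(doc.xpath("//div[@id='xx']//following::input/@name"))

print(single_slash)
print(double_slash)
```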

## `child`

Selects all **direct children** elements of the current node (context node).

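Syntax: **`//tagname[@attribute='value']/child::tagname`**

Example: **`//div[@id='xx']/child::input`**

So the search starts from the **`div`** whose **`id='xx'`** and returns all elements with **`tagname='input'`** that are a **direct child** of the **`div`** tag. Since `child` is the default axis, `//div[@id='xx']/input` is equivalent. With a double slash before `child::`, nested `input` elements at any depth would be returned as well.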
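The sketch below compares the `child` axis with a single and a double slash on an invented fragment. It assumes the third-party `lxml` package is installed.

```python
from lxml import etree

doc = etree.fromstring("""
<div id="xx">
  <input name="direct"/>
  <p><input name="nested"/></p>
</div>
""")

# Single slash: direct children only.
direct_children = list(doc.xpath("//div[@id='xx']/child::input/@name"))

# The child axis is the default, so it can be left out.
shorthand = list(doc.xpath("//div[@id='xx']/input/@name"))

# Double slash: children of the div and of all its descendants.
any_depth = sorted(doc.xpath("//div[@id='xx']//child::input/@name"))

print(direct_children, shorthand, any_depth)
```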


## `preceding`

Selects all nodes that come before the current node (context node) in document order, excluding its ancestors. This covers the `preceding-sibling` nodes and their descendants, as well as anything earlier in the document at any other level.

Syntax: **`//tagname[@attribute='value']/preceding::tagname`**

Example: **`//div[@id='xx']/preceding::input`**

So the search starts from the **`div`** whose **`id='xx'`** and returns all elements with **`tagname='input'`** that come **before** the opening **`div`** tag.
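A sketch of the `preceding` axis on an invented form, assuming the third-party `lxml` package is installed. The names are sorted alphabetically so the display does not depend on result order.

```python
from lxml import etree

doc = etree.fromstring("""
<form>
  <p><input name="nested-before"/></p>
  <input name="before"/>
  <div id="xx"><input name="inside"/></div>
  <input name="after"/>
</form>
""")

# Inputs that end before the div starts, at any level of nesting.
preceding_inputs = sorted(doc.xpath("//div[@id='xx']/preceding::input/@name"))

print(preceding_inputs)
```

The inputs inside and after the `div` are not returned, and neither is the enclosing `form`, because ancestors are never part of the `preceding` axis.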
