Kanna(鉋) is an XML/HTML parser for Swift.
Swift HTML Objective-C Ruby
Clone or download

README.md

Kanna(鉋)

Kanna(鉋) is an XML/HTML parser for cross-platform(macOS, iOS, tvOS, watchOS and Linux!).

It was inspired by Nokogiri(鋸).

Build Status Platform Cocoapod Carthage compatible Swift Package Manager Reference Status

ℹ️ Documentation

Features:

  • XPath 1.0 support for document searching
  • CSS3 selector support for document searching
  • Support for namespaces
  • Comprehensive test suite

Installation:

Swift 4

CocoaPods

⚠️ CocoaPods (1.1.0 or later) is required.

Adding it to your Podfile:

use_frameworks!
pod 'Kanna', '~> 4.0.0'
Carthage

Adding it to your Cartfile:

github "tid-kijyun/Kanna" ~> 4.0.0
  1. In the project settings add $(SDKROOT)/usr/include/libxml2 to the "header search paths" field
Swift Package Manager

Installing libxml2 to your computer:

// macOS
$ brew install libxml2
$ brew link --force libxml2

// Linux(Ubuntu)
$ sudo apt-get install libxml2-dev

Adding it to your Package.swift:

// swift-tools-version:4.0
import PackageDescription

let package = Package(
    name: "YourProject",
    dependencies: [
        .package(url: "https://github.com/tid-kijyun/Kanna.git", from: "4.0.0")
    ],
    targets: [
        .target(
            name: "YourTarget",
            dependencies: ["Kanna"]),
    ]
)
$ swift build

Note: When a build error occurs, please try run the following command:

$ sudo apt-get install pkg-config
Manual Installation
  1. Add these files to your project:
    Kanna.swift
    CSS.swift
    libxmlHTMLDocument.swift
    libxmlHTMLNode.swift
    libxmlParserOption.swift
    Modules
  2. In the target settings add $(SDKROOT)/usr/include/libxml2 to the Search Paths > Header Search Paths field
  3. In the target settings add $(SRCROOT)/Modules to the Swift Compiler - Search Paths > Import Paths field

Swift 3.0

For now, please use the feature/v3.0.0 branch.

CocoaPods

⚠️ CocoaPods (1.1.0 or later) is required.

Adding it to your Podfile:

use_frameworks!
pod 'Kanna', :git => 'https://github.com/tid-kijyun/Kanna', :branch => 'feature/v3.0.0'
Carthage

Adding it to your Cartfile:

github "tid-kijyun/Kanna" "feature/v3.0.0"
  1. In the project settings add $(SDKROOT)/usr/include/libxml2 to the "header search paths" field
Swift Package Manager

Installing libxml2 to your computer:

// macOS
$ brew install libxml2
$ brew link --force libxml2

// Linux(Ubuntu)
$ sudo apt-get install libxml2-dev

Adding it to your Package.swift:

import PackageDescription

let package = Package(
    name: "YourProject",
    
    dependencies: [
        .Package(url: "https://github.com/tid-kijyun/Kanna.git", majorVersion: 2)
    ]
)
$ swift build

Note: When a build error occurs, please try run the following command:

$ sudo apt-get install pkg-config
Manual Installation
  1. Add these files to your project:
    Kanna.swift
    CSS.swift
    libxmlHTMLDocument.swift
    libxmlHTMLNode.swift
    libxmlParserOption.swift
    Modules
  2. In the target settings add $(SDKROOT)/usr/include/libxml2 to the Search Paths > Header Search Paths field
  3. In the target settings add $(SRCROOT)/Modules to the Swift Compiler - Search Paths > Import Paths field

Swift 2.x

Three means of installation are supported:

CocoaPods

⚠️ CocoaPods (0.39 or later) is required.

Adding it to your Podfile:

use_frameworks!
pod 'Kanna', '~> 1.1.0'
Carthage

Adding it to your Cartfile:

github "tid-kijyun/Kanna" ~> 1.1.0
Manual Installation
  1. Add these files to your project:
    Kanna.swift
    CSS.swift
    libxmlHTMLDocument.swift
    libxmlHTMLNode.swift
    libxmlParserOption.swift
  2. In the target settings add $(SDKROOT)/usr/include/libxml2 to the Search Paths > Header Search Paths field
  3. In the target settings add -lxml2 to the Linking > Other Linker Flags field
  4. Import libxml headers:

Copy the those import statements:

#import <libxml/HTMLtree.h>
#import <libxml/xpath.h>
#import <libxml/xpathInternals.h>

and paste them into your [Modulename]-Bridging-Header.h

Note: With manual installation, this library doesn't need to be imported, or namespace-qualified in your code.

Synopsis:

import Kanna

let html = "<html>...</html>"

if let doc = try? HTML(html: html, encoding: .utf8) {
    print(doc.title)
    
    // Search for nodes by CSS
    for link in doc.css("a, link") {
        print(link.text)
        print(link["href"])
    }
    
    // Search for nodes by XPath
    for link in doc.xpath("//a | //link") {
        print(link.text)
        print(link["href"])
    }
}
let xml = "..."
if let doc = try? Kanna.XML(xml: xml, encoding: .utf8) {
    let namespaces = [
                    "o":  "urn:schemas-microsoft-com:office:office",
                    "ss": "urn:schemas-microsoft-com:office:spreadsheet"
                ]
    if let author = doc.at_xpath("//o:Author", namespaces: namespaces) {
        print(author.text)
    }
}

License:

The MIT License. See the LICENSE file for more infomation.