vCard/JSON extraction #19443

Closed
wants to merge 32 commits into from
Commits
eb2c7e0
Merge pull request #11 from servo/master
niravjain Nov 13, 2017
015ef37
Merge pull request #12 from servo/master
CJ8664 Nov 22, 2017
43aeb7c
Added microdata module
niravjain Nov 24, 2017
f306835
Code to send msg from servo to servoshell (EmbedderMsg)
CJ8664 Nov 25, 2017
0530512
Merge branch 'master' of https://github.com/CJ8664/servo
CJ8664 Nov 25, 2017
f939632
naming convention
CJ8664 Nov 25, 2017
3d68d2a
trying serde_json crate
CJ8664 Nov 25, 2017
a4ed961
Uploading erroneous code
CJ8664 Nov 25, 2017
4628943
Uploading erroneous code
CJ8664 Nov 25, 2017
cbfc666
Merge remote-tracking branch 'refs/remotes/origin/serde_try' into ser…
CJ8664 Nov 25, 2017
146dd58
Merge pull request #13 from CJ8664/serde_try
CJ8664 Nov 25, 2017
9284187
serde json working for Hashmap
CJ8664 Nov 25, 2017
dd4dc70
Updated the servo-shell communication
CJ8664 Nov 29, 2017
43650c2
vCard working
CJ8664 Nov 29, 2017
19408a0
Adding adr to vCard
niravjain Nov 30, 2017
79d140d
Adding adr to vCard
niravjain Nov 30, 2017
aab6900
Merged vcard and json logic
CJ8664 Nov 30, 2017
35468f8
Fixed tidy errors
niravjain Nov 30, 2017
8cafd64
Updated code to pass the type of microdata as a parameter
CJ8664 Dec 1, 2017
87d1efa
Merge branch 'vcard' of https://github.com/CJ8664/servo into vcard
CJ8664 Dec 1, 2017
c580c3f
Added code to notify user via change in title
CJ8664 Dec 1, 2017
3cd9731
Created dummy test cases
CJ8664 Dec 1, 2017
d0cb12e
Removed JSON code and replaced with a stub
CJ8664 Dec 1, 2017
0ae2d08
Merge pull request #14 from CJ8664/vcard
CJ8664 Dec 1, 2017
7131823
Merge pull request #15 from servo/master
CJ8664 Dec 1, 2017
a3e5b9f
Updated manifest
CJ8664 Dec 1, 2017
0e5023d
Partially updated the code based on reviews
CJ8664 Dec 2, 2017
37f70c6
Fixed lint issues
CJ8664 Dec 2, 2017
543de1c
Fixed lint issues
CJ8664 Dec 2, 2017
a781bba
Merge fix
CJ8664 Dec 12, 2017
04a043d
Merge branch 'servo-master'
CJ8664 Dec 12, 2017
c740aab
Rebuild cargo
CJ8664 Dec 12, 2017
Partially updated the code based on reviews

CJ8664 committed Dec 2, 2017
commit 0e5023d61dd647552b898603a2ef62d7e3239cb7
@@ -4,6 +4,7 @@

//! Communication with the compositor thread.

use script_traits::Microdata;
use SendableFrameTree;
use compositor::CompositingReason;
use euclid::{Point2D, Size2D};
@@ -146,8 +147,10 @@ pub enum EmbedderMsg {
LoadStart(TopLevelBrowsingContextId),
/// The load of a page has completed
LoadComplete(TopLevelBrowsingContextId),
/// Sends microdata
SendMicrodata(String, String),
/// Sends the extracted microdata from webpage.
/// The parameter is an enum containing either VCardData or JSONData.
/// These entries have a String that represents the actual microdata
SendMicrodata(Microdata),
}

/// Messages from the painting thread and the constellation thread to the compositor thread.
@@ -11,7 +11,7 @@ use gleam::gl;
use ipc_channel::ipc::IpcSender;
use msg::constellation_msg::{Key, KeyModifiers, KeyState, TopLevelBrowsingContextId, TraversalDirection};
use net_traits::net_error_list::NetError;
use script_traits::{LoadData, MouseButton, TouchEventType, TouchId, TouchpadPressurePhase};
use script_traits::{LoadData, Microdata, MouseButton, TouchEventType, TouchId, TouchpadPressurePhase};
use servo_geometry::DeviceIndependentPixel;
use servo_url::ServoUrl;
use std::fmt::{Debug, Error, Formatter};
@@ -194,5 +194,5 @@ pub trait WindowMethods {
fn set_animation_state(&self, _state: AnimationState) {}

/// Print Microdata on the Console or write to file
fn print_microdata(&self, _data: String, _datatype: String) {}
fn write_microdata(&self, _data: Microdata) {}
}
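
A minimal sketch of how an embedder might implement the new `write_microdata` hook. The `Microdata` enum and `WindowMethods` trait below are simplified stand-ins so the example compiles on its own; `RecordingWindow` and the `microdata.vcf` / `microdata.json` file names are hypothetical and do not come from this PR.

```rust
// Stand-in for `script_traits::Microdata`, redefined so the sketch is self-contained.
#[derive(Debug, Clone, PartialEq)]
pub enum Microdata {
    VCardData(String),
    JSONData(String),
}

// Simplified stand-in for `WindowMethods`; only the new hook is shown.
pub trait WindowMethods {
    fn write_microdata(&self, _data: Microdata) {}
}

// Hypothetical embedder window that records what it would write to disk.
pub struct RecordingWindow {
    pub written: std::cell::RefCell<Vec<(String, String)>>,
}

impl WindowMethods for RecordingWindow {
    fn write_microdata(&self, data: Microdata) {
        // A single `match` replaces string comparisons on a `datatype` argument,
        // and the compiler checks that every variant is handled.
        let (file_name, contents) = match data {
            Microdata::VCardData(text) => ("microdata.vcf", text),
            Microdata::JSONData(text) => ("microdata.json", text),
        };
        self.written
            .borrow_mut()
            .push((file_name.to_string(), contents));
    }
}
```

Because the payload type is an enum, adding a third microdata format later would produce a compile error in every embedder that has not handled it, which the earlier `(String, String)` signature could not do.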
@@ -1321,8 +1321,8 @@ impl<Message, LTF, STF> Constellation<Message, LTF, STF>
FromScriptMsg::SetFullscreenState(state) => {
self.embedder_proxy.send(EmbedderMsg::SetFullscreenState(source_top_ctx_id, state));
}
FromScriptMsg::SendMicrodata(data, datatype) => {
self.embedder_proxy.send(EmbedderMsg::SendMicrodata(data, datatype));
FromScriptMsg::SendMicrodata(result) => {
self.embedder_proxy.send(EmbedderMsg::SendMicrodata(result));
}
}
}
@@ -99,7 +99,7 @@ use ipc_channel::ipc::{self, IpcSender};
use js::jsapi::{JSContext, JSRuntime};
use js::jsapi::JS_GetRuntime;
use metrics::{InteractiveFlag, InteractiveMetrics, InteractiveWindow, ProfilerMetadataFactory, ProgressiveWebMetric};
use microdata::Microdata;
use microdata;
use msg::constellation_msg::{BrowsingContextId, Key, KeyModifiers, KeyState, TopLevelBrowsingContextId};
use net_traits::{FetchResponseMsg, IpcSend, ReferrerPolicy};
use net_traits::CookieSource::NonHTTP;
@@ -1714,15 +1714,10 @@ impl Document {

// Step 13.

jdm Dec 1, 2017

Member

The comments here refer to actual step numbers defined in the HTML specification. We should not add ones for steps that do not exist :)

CJ8664 Dec 2, 2017

Author

The previous comments were already present; we just added step 13 for microdata because, as discussed, our logic starts after the page completes loading.

jdm Dec 2, 2017

Member

Right, but the comments including steps refer to the steps in https://html.spec.whatwg.org/multipage/#the-end . There is no step there about microdata; that's something we're adding that is not part of the specification.

let htmlelement = self.get_html_element();
let result = Microdata::parse(self, htmlelement.unwrap().upcast::<Node>());
if !result.get("vcard").unwrap().is_empty() {
let event = ScriptMsg::SendMicrodata(result.get("vcard").unwrap().to_string(), "vcard".to_string());
let result = microdata::parse(self, htmlelement.unwrap().upcast::<Node>());
if let Some(data) = result {
let event = ScriptMsg::SendMicrodata(data);
self.send_to_constellation(event);
self.SetTitle(DOMString::from("Extracted vCard".to_string()));
} else if !result.get("json").unwrap().is_empty() {
let event = ScriptMsg::SendMicrodata(result.get("json").unwrap().to_string(), "json".to_string());
self.send_to_constellation(event);
self.SetTitle(DOMString::from("Extracted JSON".to_string()));
}
}

@@ -923,10 +923,6 @@ impl Element {
&self.local_name
}

pub fn tag_name(&self) -> DOMString {
self.TagName()
}

pub fn parsed_name(&self, mut name: DOMString) -> LocalName {
if self.html_element_in_html_document() {
name.make_ascii_lowercase();
@@ -85,7 +85,6 @@ extern crate script_layout_interface;
extern crate script_traits;
extern crate selectors;
extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_json;
extern crate servo_allocator;
@@ -123,7 +122,7 @@ mod dom;
pub mod fetch;
mod layout_image;
mod mem;
mod microdata;
pub mod microdata;
mod microtask;
mod network_listener;
pub mod script_runtime;
@@ -6,156 +6,153 @@ use dom::bindings::codegen::Bindings::DocumentBinding::DocumentBinding::Document
use dom::bindings::codegen::Bindings::ElementBinding::ElementBinding::ElementMethods;
use dom::bindings::inheritance::Castable;
use dom::bindings::root::DomRoot;
use dom::characterdata::CharacterData;
use dom::document::Document;
use dom::element::Element;
use dom::htmlelement::HTMLElement;
use dom::node::Node;
use dom::text::Text;
use serde_json;
use std::borrow::Cow;
use std::collections::HashMap;

pub struct Microdata {}

impl Microdata {
pub fn parse(doc: &Document, node: &Node) -> HashMap<String, String> {
let serialized_vcard = Self::parse_vcard(doc);
let serialized_json = Self::parse_json(node);
let mut serialized_data: HashMap<String, String> = HashMap::new();
serialized_data.insert("vcard".to_string(), serialized_vcard);
serialized_data.insert("json".to_string(), serialized_json);
return serialized_data;
use script_traits::Microdata;

pub fn parse(doc: &Document, node: &Node) -> Option<Microdata> {
let serialized_vcard = parse_vcard(doc);
let serialized_json = parse_json(node);
if !serialized_vcard.is_empty() {
return Some(Microdata::VCardData(serialized_vcard.to_owned()));
} else if !serialized_json.is_empty() {
return Some(Microdata::JSONData(serialized_json.to_owned()));
} else {
return None
}
}

pub fn parse_vcard(doc: &Document) -> String {
let ele = doc.upcast::<Node>();
let mut start_vcard = false;
let mut result: String = String::new();
let mut master_map: HashMap<String, HashMap<String, String>> = HashMap::new();
let mut master_key: String = String::new();
pub fn parse_vcard(doc: &Document) -> String {
let ele = doc.upcast::<Node>();
let mut start_vcard = false;
let mut result: String = String::new();
let mut master_map: HashMap<String, HashMap<String, String>> = HashMap::new();
let mut master_key: String = String::new();

result += "BEGIN:VCARD\nPROFILE:VCARD\nVERSION:4.0\nSOURCE:";
result += doc.url().as_str();
result += "BEGIN:VCARD\nPROFILE:VCARD\nVERSION:4.0\nSOURCE:";
result += doc.url().as_str();

let title = doc.Title();
if !title.is_empty() && !title.trim().is_empty() {
result += "\nNAME:";
result += title.trim();
}
let title = doc.Title();
if !title.is_empty() && !title.trim().is_empty() {
result += "\nNAME:";
result += title.trim();
}

result += "\n";

for element in ele.traverse_preorder().filter_map(DomRoot::downcast::<Element>) {
if element.is::<HTMLElement>() {
if element.has_attribute(&local_name!("itemtype")) {
let mut atoms = element.get_tokenlist_attribute(&local_name!("itemtype"), );
if !atoms.is_empty() {
let val = atoms.remove(0);
if val.trim() == "http://microformats.org/profile/hcard" {
if !start_vcard {
start_vcard = true;
} else {
break;
}
result += "\n";

for element in ele.traverse_preorder().filter_map(DomRoot::downcast::<Element>) {
if element.is::<HTMLElement>() {
if element.has_attribute(&local_name!("itemtype")) {
let mut atoms = element.get_tokenlist_attribute(&local_name!("itemtype"), );
if !atoms.is_empty() {
let val = atoms.remove(0);
if val.trim() == "http://microformats.org/profile/hcard" {
if !start_vcard {
start_vcard = true;
} else {
break;
}
}
}
if start_vcard {
let mut atoms = element.get_tokenlist_attribute(&local_name!("itemprop"), );
if !atoms.is_empty() {
let temp_key = atoms.remove(0);
if element.has_attribute(&local_name!("itemscope")) {
master_key = String::from(temp_key.trim()).to_owned();
let dup_master_key = Cow::Borrowed(&master_key);
master_map.entry(dup_master_key.to_string()).or_insert(HashMap::new());
} else {
let temp = String::from(temp_key.trim()).to_owned();
let dup_key = Cow::Borrowed(&temp);
let data = String::from(element.GetInnerHTML().unwrap());
let dup_master_key = Cow::Borrowed(&master_key);
let temp_map = master_map.entry(dup_master_key.to_string()).or_insert(HashMap::new());
temp_map.insert(dup_key.to_string(), String::from(data));
}
}
if start_vcard {
let mut atoms = element.get_tokenlist_attribute(&local_name!("itemprop"), );
if !atoms.is_empty() {
let temp_key = atoms.remove(0);
if element.has_attribute(&local_name!("itemscope")) {
master_key = String::from(temp_key.trim()).to_owned();
let dup_master_key = Cow::Borrowed(&master_key);
master_map.entry(dup_master_key.to_string()).or_insert(HashMap::new());
} else {
let temp = String::from(temp_key.trim()).to_owned();
let dup_key = Cow::Borrowed(&temp);
let data = String::from(element.GetInnerHTML().unwrap());
let dup_master_key = Cow::Borrowed(&master_key);
let temp_map = master_map.entry(dup_master_key.to_string()).or_insert(HashMap::new());
temp_map.insert(dup_key.to_string(), String::from(data));
}
}
}
}
let vcard_parts = ["n", "org", "tel", "adr"];
for info_type in vcard_parts.iter() {
let detail_map_val = master_map.get(*info_type);
if detail_map_val.is_none() {
continue;
}
let detail_map = detail_map_val.unwrap();
match *info_type {
"n" => {
let mut n_value: String = String::new();

let name_parts = ["family-name", "given-name",
"additional-name", "honorific-prefix", "honorific-suffix"];
for part in name_parts.iter() {
if detail_map.contains_key(*part) {
n_value += format!("{};", detail_map.get(*part).unwrap()).as_str();
}
}
let vcard_parts = ["n", "org", "tel", "adr"];
for info_type in vcard_parts.iter() {
let detail_map_val = master_map.get(*info_type);
if detail_map_val.is_none() {
continue;
}
let detail_map = detail_map_val.unwrap();
match *info_type {
"n" => {
let mut n_value: String = String::new();

let name_parts = ["family-name", "given-name",
"additional-name", "honorific-prefix", "honorific-suffix"];
for part in name_parts.iter() {
if detail_map.contains_key(*part) {
n_value += format!("{};", detail_map.get(*part).unwrap()).as_str();
}
n_value.pop();
}
n_value.pop();

result += format!("{}:{}\n", info_type.to_ascii_uppercase(), n_value).as_str();
},
"org" => {
let mut org_value: String = String::new();
result += format!("{}:{}\n", info_type.to_ascii_uppercase(), n_value).as_str();
},
"org" => {
let mut org_value: String = String::new();

let org_parts = ["organization-name", "organization-unit"];
for part in org_parts.iter() {
if detail_map.contains_key(*part) {
org_value += format!("{};", detail_map.get(*part).unwrap()).as_str();
}
let org_parts = ["organization-name", "organization-unit"];
for part in org_parts.iter() {
if detail_map.contains_key(*part) {
org_value += format!("{};", detail_map.get(*part).unwrap()).as_str();
}
org_value.pop();
}
org_value.pop();

result += format!("{}:{}\n", info_type.to_ascii_uppercase(), org_value).as_str();
},
"tel" => {
let mut tel_value: String = String::new();
result += format!("{}:{}\n", info_type.to_ascii_uppercase(), org_value).as_str();
},
"tel" => {
let mut tel_value: String = String::new();

let tel_parts = ["value"];
for part in tel_parts.iter() {
if detail_map.contains_key(*part) {
tel_value += format!("{};", detail_map.get(*part).unwrap()).as_str();
}
let tel_parts = ["value"];
for part in tel_parts.iter() {
if detail_map.contains_key(*part) {
tel_value += format!("{};", detail_map.get(*part).unwrap()).as_str();
}
tel_value.pop();

result += format!("{}:{}\n", info_type.to_ascii_uppercase(), tel_value).as_str();
},
"adr" => {
let mut adr_value: String = String::new();

let adr_parts = ["street-address", "locality", "region", "postal-code",
"country-name", "post-office-box", "extended-address"];
for part in adr_parts.iter() {
if detail_map.contains_key(*part) {
adr_value += format!("{};", detail_map.get(*part).unwrap()).as_str();
}
}
tel_value.pop();

result += format!("{}:{}\n", info_type.to_ascii_uppercase(), tel_value).as_str();
},
"adr" => {
let mut adr_value: String = String::new();

let adr_parts = ["street-address", "locality", "region", "postal-code",
"country-name", "post-office-box", "extended-address"];
for part in adr_parts.iter() {
if detail_map.contains_key(*part) {
adr_value += format!("{};", detail_map.get(*part).unwrap()).as_str();
}
adr_value.pop();
}
adr_value.pop();

result += format!("{}:{}\n", info_type.to_ascii_uppercase(), adr_value).as_str();
},
_ => {},
}
}
result += "END:VCARD";
if start_vcard {
return result;
} else {
return "".to_string();
result += format!("{}:{}\n", info_type.to_ascii_uppercase(), adr_value).as_str();
},
_ => {},
}
}

pub fn parse_json(node: &Node) -> String {
// TODO Write the logic for JSON Parsing
result += "END:VCARD";
if start_vcard {
return result;
} else {
return "".to_string();
}
}

pub fn parse_json(node: &Node) -> String {
// TODO Write the logic for JSON Parsing
return "".to_string();
}
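
The four `match` arms in `parse_vcard` share one pattern: look up a fixed list of sub-properties in order, join the ones that are present with `;`, and emit `NAME:value`. A sketch of that pattern as a single helper follows. The function `vcard_line` is hypothetical and not part of this PR. Unlike the original arms, it returns `None` instead of an empty `N:` line when no part is present, and it leaves the trailing newline to the caller.

```rust
use std::collections::HashMap;

/// Builds one vCard line such as `N:Doe;Jane` from the properties collected
/// for a single item. Returns `None` when none of the listed parts is present.
pub fn vcard_line(
    name: &str,
    parts: &[&str],
    details: &HashMap<String, String>,
) -> Option<String> {
    // Keep only the parts that exist, preserving the order given in `parts`.
    let values: Vec<&str> = parts
        .iter()
        .filter_map(|part| details.get(*part))
        .map(String::as_str)
        .collect();
    if values.is_empty() {
        return None;
    }
    // `join` inserts the separator between values only, so no trailing `;`
    // has to be removed with `pop()` afterwards.
    Some(format!("{}:{}", name.to_ascii_uppercase(), values.join(";")))
}
```

With a helper like this, each arm reduces to a call with its own list of parts, for example `vcard_line("org", &["organization-name", "organization-unit"], detail_map)`.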
@@ -161,6 +161,15 @@ pub enum JsEvalResult {
Ok(Vec<u8>)
}

/// The result of parsing microdata from a webpage
#[derive(Debug, Deserialize, Serialize)]
pub enum Microdata {
/// The String that has the vCard information
VCardData(String),
/// The String that has the JSON information
JSONData(String),
}

impl LoadData {
/// Create a new `LoadData` object.
pub fn new(url: ServoUrl,
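
With the `Microdata` enum in place, the selection step in `microdata::parse` can be sketched in isolation. This is an illustration under assumptions, not the PR's code: the enum is redefined locally so the example compiles on its own, and `select_microdata` is a hypothetical name. It prefers vCard output, falls back to JSON, and yields `None` when both strings are empty.

```rust
// Local stand-in for `script_traits::Microdata`.
#[derive(Debug, PartialEq)]
pub enum Microdata {
    VCardData(String),
    JSONData(String),
}

/// Chooses which serialized form to report, if any.
pub fn select_microdata(vcard: String, json: String) -> Option<Microdata> {
    if !vcard.is_empty() {
        Some(Microdata::VCardData(vcard))
    } else if !json.is_empty() {
        // The fallback must test the JSON string, not the vCard string again.
        Some(Microdata::JSONData(json))
    } else {
        None
    }
}
```

Taking the strings by value lets each variant own its payload without an extra `to_owned()` copy.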