File name too long @ dir_s_mkdir - Max file name and Transliterate UTF-8 characters to ASCII - Babosa #354

Closed
Natshah opened this issue Dec 15, 2015 · 2 comments

Comments

@Natshah

Natshah commented Dec 15, 2015

Hi,
I run into this error when I use the spider option.

no paths defined in config, crawling from site root
creating new spider file

/usr/local/lib/ruby/2.1.0/fileutils.rb:250:in `mkdir': File name too long @ dir_s_mkdir - shots_nojs/thumbnails/ar__content%d8%b4%d8%b1%d9%83%d8%a9-agip-%d9%87%d9%8a-%d9%85%d8%b3%d8%ac%d9%84-%d9%85%d8%b9%d8%aa%d9%85%d8%af-%d8%ac%d8%af%d9%8a%d8%af-%d9%84%d8%af%d9%8a%d9%86%d8%a7-%d9%84%d9%84%d9%86%d8%b7%d8%a7%d9%82%d8%a7%d8%aa-%d8%a7%d9%84%d9%82%d8%b7%d8%b1%d9%8a%d8%a9-comqa-%d9%88-netqa-%d9%88-nameqa-%d9%88-qa-%d9%88-%d9%82%d8%b7%d8%b1

We need to validate and limit the maximum length of each name, and transliterate UTF-8 characters to ASCII, before creating files and directories.
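As a rough illustration of the validation step, here is a minimal sketch in plain Ruby that reports which components of a path are too long. The helper name `oversized_components` and the 255-byte limit are assumptions for illustration (255 bytes is the per-name limit on many Linux filesystems, but it varies); this is not part of Wraith or Babosa.

```ruby
# Assumption: 255 bytes per path component; adjust for the target filesystem.
NAME_MAX_BYTES = 255

# Hypothetical helper: returns the path components whose byte length
# exceeds the limit, so the caller can shorten them before calling mkdir.
def oversized_components(path, limit = NAME_MAX_BYTES)
  path.split("/").select { |part| part.bytesize > limit }
end
```

For the path in the error above, only the last component (the percent-encoded page name) would be expected to show up, since `shots_nojs` and `thumbnails` are short.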

This sample code reproduces the case we hit when running the spider.

#!/usr/bin/env ruby
require 'thor'
require "babosa"

class DirWorker < Thor
  include Thor::Actions

  def self.source_root
    File.expand_path("../../../", __FILE__)
  end

  desc "create NAME", "To create a directory"
  def create_directory(name)
    # Transliterate the name to ASCII, then cap it at 100 bytes before creating the directory.
    FileUtils.mkdir_p(name.to_slug.transliterate.truncate_bytes(100).to_s)
  end
end
DirWorker.start(ARGV)

If we run this command:
ruby DirWorker.rb create "ar__content%d8%b4%d8%b1%d9%83%d8%a9-agip-%d9%87%d9%8a-%d9%85%d8%b3%d8%ac%d9%84-%d9%85%d8%b9%d8%aa%d9%85%d8%af-%d8%ac%d8%af%d9%8a%d8%af-%d9%84%d8%af%d9%8a%d9%86%d8%a7-%d9%84%d9%84%d9%86%d8%b7%d8%a7%d9%82%d8%a7%d8%aa-%d8%a7%d9%84%d9%82%d8%b7%d8%b1%d9%8a%d8%a9-comqa-%d9%88-netqa-%d9%88-nameqa-%d9%88-qa-%d9%88-%d9%82%d8%b7%d8%b1"

This works. But if we take out truncate_bytes, mkdir fails with the same "File name too long" error.
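To show what byte-based truncation involves, here is a minimal sketch using only the Ruby standard library. It is not Babosa's implementation of truncate_bytes; the method name `truncate_to_bytes` is hypothetical, and it assumes the input string is validly encoded (for example UTF-8).

```ruby
# Hypothetical helper: cut a string down to at most max_bytes bytes
# without leaving a partial multi-byte character at the end.
def truncate_to_bytes(name, max_bytes = 100)
  return name if name.bytesize <= max_bytes

  truncated = name.byteslice(0, max_bytes)
  # If the cut landed inside a multi-byte character, drop trailing bytes
  # until the string is valid in its encoding again.
  truncated = truncated.byteslice(0, truncated.bytesize - 1) until truncated.valid_encoding?
  truncated
end

truncate_to_bytes("abcdef", 3) # => "abc"
```

Truncating by bytes rather than characters matters here because filesystem limits are expressed in bytes, and a non-ASCII character can take several bytes.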

Thank you.

@Natshah
Author

Natshah commented Dec 15, 2015

My proposed solution: add validation to Wraith::FolderManager so that it transliterates names and shortens them with truncate_bytes.
We also need an Arabic transliterator for Babosa:

# encoding: utf-8
module Babosa
  module Transliterator
    class Arabic < Base
      APPROXIMATIONS = {


      }

      def transliterate(string)
        super.gsub(/(c)z([ieyj])/) { "#{$1}#{$2}" }
      end
    end
  end
end

Then fork Wraith, add some custom config functions, and apply the Arabic transliteration and truncate_bytes wherever we create or use files and directories.

@ChrisBAshton
Contributor

Spidering has been overhauled in Wraith v4. Please re-open if this is still an issue.
