Skip to content

Latest commit

 

History

History
114 lines (98 loc) · 2.94 KB

187. Repeated DNA Sequences.md

File metadata and controls

114 lines (98 loc) · 2.94 KB

leetcode Daily Challenge on October 17th, 2020.


Difficulty : Medium

Related Topics : HashTableBit Manipulation


All DNA is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T', for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]

Example 2:

Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]

Constraints:

  • 0 <= s.length <= 10^5
  • s[i] is 'A', 'C', 'G', or 'T'.

Solution

  • mine
    • Java
      • Runtime: 15 ms, faster than 87.02%, Memory Usage: 47.4 MB, less than 13.21% of Java online submissions
        //O(N)time
        //O(N)space
        public List<String> findRepeatedDnaSequences(String s) {
            int n = s.length();
            if(n <= 10) return new ArrayList<>();
            HashSet<String> set = new HashSet<>();
            HashSet<String> res = new HashSet<>();
            for(int i = 0; i <= n - 10; i++){
                String t = s.substring(i , i + 10);
                if(set.contains(t)){
                    if(!res.contains(t)){
                        res.add(t);
                    }
                }else{
                    set.add(t);
                }
            }
            return new ArrayList<>(res);
        }
        

  • the most votes
  • BitManipulation Runtime: 4 ms, faster than 99.63%, Memory Usage: 46.5 MB, less than 13.21% of Java online submissions
    // O(N)time
    // O(2^L)space
    // L = 10
    public List<String> findRepeatedDnaSequences(String s) {
        int len;
        if (s == null || (len = s.length()) < 10) {
            return Collections.emptyList();
        }
        int[] map = new int[128];
        map['A'] = 0;
        map['C'] = 1;
        map['G'] = 2;
        map['T'] = 3;
    
        boolean[] seenArr = new boolean[1024 * 1024];
        int buf = 0;
        char[] chars = s.toCharArray();
        for (int i = 0; i < 10; i++) {
            buf = (buf << 2) + map[chars[i]];
        }
        seenArr[buf] = true;
    
        boolean[] addedArr = new boolean[1024 * 1024];
        List<String> results = new ArrayList<>(len / 2);
        for (int i = 10; i < len; i++) {
            buf = ((buf << 2) & 0xFFFFF) + map[chars[i]];
            if (seenArr[buf]) {
                if (!addedArr[buf]) {
                    results.add(new String(chars, i - 9, 10));
                    addedArr[buf] = true;
                }
            } else {
                seenArr[buf] = true;
            }
        }
        return results;
    }